What the F**k happened?

georget

To give a little background, I am running Win98se. The computer is a
1GHz PIII IBM computer which I have modified somewhat, and have 384
megs ram. I have two harddrives, partitioned as follows, C D on
drive 1, and E F G H on drive 2. Drive C is my OS and programs. D:
is where I keep the Windows98 CABS (entire CD) and where I store my
video files. E: is storage for my dig camera photos, and F: is a
small partition of only 2 gigs. That's where I have this newsreader
program, Agent 2, installed, as well as Firefox. The only other things
on that partition are some dos programs and utilities. I keep Agent
in this small partition because Agent tends to cause lots of
fragmentation as I read, delete, and save usenet messages. Thus, it's
easy to defrag a small partition.

Now for the PROBLEM....
Last night I was reading some newsgroups. Then I installed an upgrade
for Firefox, from 2.0.0.3 to the latest version, which I think is
2.0.0.11. Both of these things were on drive F:. The Firefox upgrade
worked fine, and I loaded a few webpages with it. The last thing I
did was to retrieve the messages from a few newsgroups, before
shutting the computer off and going to bed.

This morning I turned on the computer and it booted to the Dos prompt.
(I always boot to dos and type "win" when I want windows). I typed
"WIN" and the computer froze up. I was making breakfast so it sat
that way for a half hour or more. I shut it off and restarted. It
told me there was no operating system to boot from, and I heard the
hard drives power down, but the computer fans still ran. (I DO NOT
use any power management at all). I shut off and on the computer
several times and had the same problem. I finally went to the BIOS
and reset it to DEFAULT, except shutting off the power management
again. While in the bios, it showed NO hard drives existed.
I rebooted and went to the dos prompt. I started windows and that
worked fine. I went online and opened Agent. I was reading the
messages (on this newsgroup), when I typed a reply to something. When
I clicked SEND, I got a blue screen and it said "Cannot write to drive
F:". Then everything froze up. I shut off the computer and once
again I could not boot and got the error "NO bootable drive"
(something like that). Once again I reset the bios and everything was
fine. I booted to dos and ran scandisk. All partitions were fine,
except F:. It said I could not read F:. Then it said the FAT for
F: was corrupt, and I let scandisk do the fix for it. However, in the
end, everything on the whole partition was corrupt. I ended up with
over 100 .CHK files, and except for one directory, all the directory
names were changed to numbers. In other words, everything on that
partition was useless garbage. Scandisk kept re-running and finding
more and more errors. I finally just stopped it, and formatted the
partition. Nothing was really lost except the upgrade to Firefox, a
few bookmarks since my last backup, and all the newsgroup messages I
had saved in the past week (which are easy to replace).

After the format, I ran scandisk 3 times, the FULL scan. No drive
errors at all. Then I restored my backup for that partition, and
everything was fine until I opened Agent. Agent opened fine, but as
soon as I went online and started to retrieve messages, the computer
froze up again. This time when I rebooted, it could not find the
"boot drive" again, and once again I reset the bios to default.

Once booted, I reinstalled Agent on another partition (after doing a
complete backup). Then I did a FILE COMPARE, using the old Dos COMP
command, and found that the new Agent.exe was different than the one
from my backup. I zipped the old version and stashed it on a flash
drive. Then I reinstalled Agent from scratch on my F: partition.
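
The file compare described above can be sketched in Python for anyone without the DOS COMP command to hand. This is only an illustration, assuming a machine with Python 3 available (not necessarily the Win98 box itself); the function name `first_difference` and the file names in the demo are hypothetical, not part of any tool mentioned in this thread. It reports the offset of the first byte where two files differ, which helps tell a differently built program file apart from one with a single damaged region.

```python
import os
import tempfile


def first_difference(path_a, path_b, chunk_size=64 * 1024):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Walk the mismatching chunk to find the exact byte.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No byte differs, so one file is simply shorter:
                # the difference starts where the shorter one ends.
                return offset + min(len(a), len(b))
            if not a:
                # Both files reached end-of-file together.
                return None
            offset += len(a)


if __name__ == "__main__":
    # Demo on two throwaway files (hypothetical names, made-up contents).
    folder = tempfile.mkdtemp()
    good = os.path.join(folder, "good.bin")
    bad = os.path.join(folder, "bad.bin")
    with open(good, "wb") as f:
        f.write(b"A" * 1000)
    with open(bad, "wb") as f:
        f.write(b"A" * 400 + b"B" + b"A" * 599)
    print(first_difference(good, bad))
```

A result of `None` means the two files match byte for byte; any number means they differ from that offset on.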

Ever since I did that, I have had no further problems. I have
rebooted numerous times, gone on the newsgroups, ran Firefox (which is
back to the older version), and no further problems at all. I checked
all cables to ensure they are tight as well as pushing ram sticks in
to ensure they are tight.

What the F**K happened?
I have worked with computers for years, built many of them, and this
is just too weird. Either Agent.exe was corrupt, (but why would it
cause loss of FAT?), or the upgrade to Firefox caused a problem (yet
Firefox ran fine).

I'm thinking the battery may be weak, thus the loss of information,
but can't get one till Monday. Yet, the computer booted fine since....

This is just too bizarre.....
Anyone have any suggestions?

NO, I do not have any viruses. I checked that too.

I should also mention that something similar to this happened about a
week ago. That time the whole Agent directory was corrupt, but not
the rest of that partition. I deleted the Agent dir and replaced it
from my backup (which would be that same backup that appears to be a
bad .EXE).

I sure could use some help understanding this....

This was crossposted to 3 related newsgroups.
microsoft.public.win98.gen_discussion
alt.comp.hardware.homebuilt
alt.comp.hardware

Thanks

George T
 
kony

To give a little background, I am running Win98se. The computer is a
1GHz PIII IBM computer which I have modified somewhat, and have 384
megs ram.

Ok, but being someone who hacks at anything that begs for
it, I have to ask, what have you modified? Granted, many
things will not matter, but to cut to the chase, it might be
possible that something that seemed like it wouldn't matter, did
matter.

I would also encourage you to use a more descriptive topic
title, so others who might have some knowledge about the
topic, and/or others who might seek the same resolution in
the future, will be able to weed through a large body of
data.


I have two harddrives, partitioned as follows, C D on
drive 1, and E F G H on drive 2. Drive C is my OS and programs. D:
is where I keep the Windows98 CABS (entire CD) and where I store my
video files. E: is storage for my dig camera photos, and F: is a
small partition of only 2 gigs. That's where I have this newsreader
program, Agent 2, installed, as well as Firefox. The only other things
on that partition are some dos programs and utilities. I keep Agent
in this small partition because Agent tends to cause lots of
fragmentation as I read, delete, and save usenet messages. Thus, it's
easy to defrag a small partition.

I don't find agent to be much of a problem because even if
it's fragmented, the total size is small and the data rate
is not so demanding... a few KB of text here and there.

Now for the PROBLEM....
Last night I was reading some newsgroups. Then I installed an upgrade
for Firefox, from 2.0.0.3 to the latest version, which I think is
2.0.0.11. Both of these things were on drive F:. The Firefox upgrade
worked fine, and I loaded a few webpages with it. The last thing I
did was to retrieve the messages from a few newsgroups, before
shutting the computer off and going to bed.


I am dreading reading your next "paragraph". It needs a few
carriage returns. Maybe I'm just a cranky old fart, but on
usenet, maintaining an attention span of the readers is a
nice thing.


This morning I turned on the computer and it booted to the Dos prompt.
(I always boot to dos and type "win" when I want windows). I typed
"WIN" and the computer froze up.

Ok, next time you have a working system, make a backup of
the OS partition, with user data on another partition, so a
backup can be quickly restored. With win9x especially,
since it is a more fragile OS and takes up less
space for a backup, it is a really good idea.

Next I will wonder, why did you set it up to boot to DOS and
then need user interaction to start windows? Most things you'd
want to do are accessible in win9x, since it does essentially
run on DOS.


I was making breakfast so it sat
that way for a half hour or more. I shut it off and restarted. It
told me there was no operating system to boot from, and I heard the
hard drives power down, but the computer fans still ran.

At this point I would be holding my breath, if I hadn't
made a backup, thinking the HDD was toast (but, I haven't
read the rest of your gigantic paragraph yet).

As such, I would run scandisk and if that didn't complete,
then run the HDD manufacturer's utils.

If your hard drive is as old as the OS, NOW is the time to
replace it, copying off any files you have if they are
readable. A win98 era system's hard drive is beyond its
expected lifespan already, if it was new when win98
shipped.


(I DO NOT
use any power management at all). I shut off and on the computer
several times and had the same problem. I finally went to the BIOS
and reset it to DEFAULT, except shutting off the power management
again. While in the bios, it showed NO hard drives existed.

This tends to implicate a failing HDD, if nothing else
had changed. Just for kicks I would change the HDD data
cable, because these insulation displacement cables aren't
designed to work well for many years, but if the system had
no changes other than the ones you have mentioned, a drive
failure would seem more likely (but again, I haven't read the
rest of your post yet).



I rebooted and went to the dos prompt. I started windows and that
worked fine. I went online and opened Agent. I was reading the
messages (on this newsgroup), when I typed a reply to something. When
I clicked SEND, I got a blue screen and it said "Cannot write to drive
F:". Then everything froze up.

Before I speculate more than I already have, more info about
the variables I mentioned already would help.

Also at this point, I should ask, "do you have a backup of
the system at the last point it worked stably"?

I think HDD failure is more likely, but gross failure of
other subsystems is not impossible... but it would tend
to cause problems in more areas than just Agent.



I shut off the computer and once
again I could not boot and got the error "NO bootable drive"
(something like that).

This is a really good sign it is not windows, not software
at all, it is a drive or connected cabling problem. Rarely,
a PSU problem can also cause this, but we only have a lot of
software issues mentioned and this is a hardware group
without yet having a concise but complete list of all major
hardware.

Once again I reset the bios and everything was
fine. I booted to dos and ran scandisk. All partitions were fine,
except F:. It said I could not read F:. Then it said the FAT for
F: was corrupt, and I let scandisk do the fix for it. However, in the
end, everything on the whole partition was corrupt.

Do you have a HDD over 128GB in size? Win98 has some issues
with that.
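
For reference, the 128GB figure is commonly attributed to 28-bit LBA addressing with 512-byte sectors, which is the assumption behind the arithmetic sketched below. The constant and function names are made up for illustration, and the 20 GB size in the demo is just an example value.

```python
SECTOR_BYTES = 512   # classic ATA sector size (assumed)
LBA28_BITS = 28      # address width of the older ATA LBA scheme (assumed)


def lba_limit_bytes(address_bits=LBA28_BITS, sector_bytes=SECTOR_BYTES):
    """Largest capacity addressable with the given LBA width, in bytes."""
    return (2 ** address_bits) * sector_bytes


def fits_in_lba28(size_bytes):
    """True if a drive of this size stays within the 28-bit LBA limit."""
    return size_bytes <= lba_limit_bytes()


if __name__ == "__main__":
    limit = lba_limit_bytes()
    print(limit)                          # 137438953472 bytes
    print(f"{limit / 2**30:.1f} GiB")     # 128.0 GiB (binary gigabytes)
    print(f"{limit / 10**9:.1f} GB")      # 137.4 GB (decimal gigabytes)
    print(fits_in_lba28(20 * 10**9))      # True for a 20 GB example drive
```

The same limit shows up as "128GB" or "137GB" depending on whether binary or decimal gigabytes are being counted.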

Also, just for kicks, you might run memtest86+ for a few
hours just to rule out memory errors. I don't think that is
the primary cause, but if you have them, trying to weed
through problems can be like banging your head against the
wall.

I ended up with
over 100 .CHK files, and except for one directory, all the directory
names were changed to numbers. In other words, everything on that
partition was useless garbage.

The conservative answer is to copy off any important data
because the drive may be nearly dead. AFTER you have done
this, you have some leisure to do more troubleshooting.

Scandisk kept re-running and finding
more and more errors. I finally just stopped it, and formatted the
partition. Nothing was really lost except the upgrade to Firefox, a
few bookmarks since my last backup, and all the newsgroup messages I
had saved in the past week (which are easy to replace).

After the format, I ran scandisk 3 times, the FULL scan. No drive
errors at all. Then I restored my backup for that partition, and
everything was fine until I opened Agent. Agent opened fine, but as
soon as I went online and started to retrieve messages, the computer
froze up again. This time when I rebooted, it could not find the
"boot drive" again, and once again I reset the bios to default.

Whatever might or might not be wrong with Agent, it has no
direct bearing on your system freezing when trying to boot.
Agent is only software, cleared when a reset has been done, at
which point the bios and hardware fitness are the only things
that matter. I think your HDD is progressively failing.


Once booted, I reinstalled Agent on another partition (after doing a
complete backup). Then I did a FILE COMPARE, using the old Dos COMP
command, and found that the new Agent.exe was different than the one
from my backup. I zipped the old version and stashed it on a flash
drive. Then I reinstalled Agent from scratch on my F: partition.


I will now refer back to the idea to run memtest86+, in case
you have memory errors causing gross file corruption.

Ever since I did that, I have had no further problems. I have
rebooted numerous times, gone on the newsgroups, ran Firefox (which is
back to the older version), and no further problems at all. I checked
all cables to ensure they are tight as well as pushing ram sticks in
to ensure they are tight.

What the F**K happened?
I have worked with computers for years, built many of them, and this
is just too weird. Either Agent.exe was corrupt, (but why would it
cause loss of FAT?), or the upgrade to Firefox caused a problem (yet
Firefox ran fine).

It is good to do CRC checks on suspect files.
If one is corrupt, I would primarily suspect a low rate of
memory errors. Some might say that stray cosmic rays could
cause it, and it is true, but IMO, less likely. At this
point I would backup everything again and do a lot of large
file copying, then check the CRC checksums of the originals
against those of the duplicates.
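
One way to do that copy-and-check test, sketched in Python. This assumes a machine with Python 3 available; `file_crc32` and `copy_and_verify` are hypothetical helper names, and the only library calls used are the standard `zlib.crc32` and `shutil.copyfile`. It is a rough illustration of the idea, not a substitute for a proper diagnostic tool.

```python
import os
import shutil
import tempfile
import zlib


def file_crc32(path, chunk_size=64 * 1024):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def copy_and_verify(src, dst):
    """Copy src to dst, then report whether both have the same CRC-32."""
    expected = file_crc32(src)
    shutil.copyfile(src, dst)
    return file_crc32(dst) == expected


if __name__ == "__main__":
    # Demo with a throwaway file of random bytes (hypothetical names).
    folder = tempfile.mkdtemp()
    original = os.path.join(folder, "original.bin")
    duplicate = os.path.join(folder, "duplicate.bin")
    with open(original, "wb") as f:
        f.write(os.urandom(1024 * 1024))
    print(f"{file_crc32(original):08X}")
    print(copy_and_verify(original, duplicate))
```

Keep in mind that a freshly written copy may be read back from the operating system's cache rather than from the disk, so a match right after copying is weaker evidence than a match after a reboot. Repeated mismatches on large files would point toward memory, cabling, or the drive rather than any one program.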

You never did mention the hardware, sometimes it feels like
a waste of time to not know this. For example, certain VIA
chipsets would corrupt data, and while these were younger
than win98, there is no real way we can safely assume what
the hardware is.

I'm thinking the battery may be weak, thus the loss of information,
but can't get one till Monday. Yet, the computer booted fine since....

The battery will cause loss of bios settings, but should not
cause this.



This is just too bizarre.....
Anyone have any suggestions?

NO, I do not have any viruses. I checked that too.

I should also mention that something similar to this happened about a
week ago. That time the whole Agent directory was corrupt, but not
the rest of that partition. I deleted the Agent dir and replaced it
from my backup (which would be that same backup that appears to be a
bad .EXE).

I sure could use some help understanding this....

This was crossposted to 3 related newsgroups.
microsoft.public.win98.gen_discussion
alt.comp.hardware.homebuilt
alt.comp.hardware

If nothing I mentioned above, and I mean all things,
helps, I would try doing a clean installation of win98 and
agent, nothing else yet but the mandatory drivers, then
see if the problem persists.
 
SteveH

Sounds like the hard drive in question is starting to fail.

SteveH
 
philo

First thing to do is go to the website of your hard drive manufacturer and
download their disk diagnostic utility...
if it shows any errors at all...replace the drive at once!

BTW: Also check your cables...possibly a bad cable or poor connection...(but
not likely)
 
ElJerid

I encountered exactly the same symptoms a few weeks ago on a customer PC. What
caught my attention is "a computer which I have modified somewhat", because
my customer used the same words.
In my case, the modification was just the addition of a second hard drive, which
resulted in daily blue screens or PC stops, mostly when browsing the net.

My first idea was also to check for IE or Windows errors, but I decided
first to check all connections. They were ok, but the customer had had to add a
power supply splitter cable, as not enough connectors were available for the
new disk. The splitter came from an old components box and seemed to have
oxidized contact pins. I replaced it with a new one.
The problem was solved, not a single blue screen since then.

I mention this because no mistake was made while replacing, formatting or
partitioning the drive. Just an old cable splitter...
 
thanatoid

(e-mail address removed) wrote in
To give a little background, I am running Win98se. The
computer is a 1GH PIII Ibm computer which I have modified

<SNIP>

I read this with interest, since I believe occasionally our
computers are visited by the dyevyll himself (herself, more
likely).

I have a WORSE story, if you can believe it, but it would take
too long to write down. Suffice it to say it almost drove me out
of my mind, took over 2 months to fix (which included buying -
unnecessarily - a new motherboard, PSU, etc.) only to finally
find out the problem was a very basic and simple program which
decided to act like a virus during a particular month in that
particular year (on 2 completely clean installs on 2 computers,
and when I changed the date, it was FINE on BOTH). I had been
using it for about 5 years before it happened.

I was curious what replies you would get. Aside from the demonic
interference (I firmly believe it does happen - but ONLY with
computers) I think the suggestion to get a HD verification
utility from the manufacturer's web site is the first thing to
do. I have cables more than 10 years old and they are fine.
OTOH, I /have/ learned over the years with everything from lamps
to pro video, that it is (almost) always the cable. Since they
cost almost nothing, why not change them. But do the HD check.

PLEASE post if and when you figure it out. I love these stories
- I know 1st hand they're maddening, see above - still, very
interesting.
 
georget

First, thanks for the reply. I know this was a long message, but I
had to explain the whole thing.

To answer your questions. What I changed on this computer are as
follows.

1. The computer came with Win2K installed on a 10 gig drive. I
replaced that drive with TWO 20 gig hard drives, and installed Win98.
(Win98 is all I will use. I can't stand anything newer than ME, and
would use WinME but I heard too many stories about problems with it.)

2. I added another 256meg ram stick. It came with 128megs. Now I have
both.

3. I added a USB 2.0 card.

4. I removed the internal CD drive, and replaced it with an external
USB CD burner

Otherwise it's the same.....

You asked about the hardware.....
The computer is an IBM Netvista 6341

I copied the following from Norton System info.
*Bios IBM 03/17/05
*Processor GenuineIntel Family 6 Model 8 1002
*Video 800x600 in True Color, Intel(R) 82810E Graphics Controller Ver.
4.0
*Memory 382 MB
*Bus Type PCI
*Hard Drives 19.14 GB, 18.65 GB

-----
What I did so far.......
I copied all partitions into separate directories on a 3rd (spare) 20
gig drive, so I have a complete backup.

Reset bios to default, and disabled all Power Savers and screen
savers.

I pushed all internal cables and memory to be sure they were plugged
in well. I removed some dust at the same time. I also pushed all
external cables to check they were tight.

Ran Ad-Aware, AVG Free, and Spybot. No viruses found, except one zip
file which I downloaded recently (and did not open) contained spyware.
I deleted it.

I ran RegSeeker and cleaned the Registry
I ran Defrag and defragged C: and F: (the others were ok)

----
Now for a few more questions.

1. How do I do CRC checks?

2. Where do I get memtest86+?

3. If I do have a failing hard drive, which one would it be?
The problem partition is F: which is on my SECOND drive.
Yet, when the computer would not boot, the boot partition is C: on my
FIRST drive?

----
I'm seriously thinking about the cable after reading your reply. That
is the one common link to BOTH drives. Of course there's the
harddrive controller, but that's part of the motherboard.

Thanks to all
George T

######## ................. #########
 
T

thanatoid

Encountered exactly the same symptoms a few weeks ago on a
customer PC. What caught my attention is "a computer which
I have modified somewhat", because my customer used the same
words. In my case, the modification was just the addition of a
second hard drive, which resulted in daily blue screens or
PC stops, mostly when browsing the net.

My first idea was also to check for IE or Windows errors,
but I decided first to check all connections. They were ok,
but the customer had had to add a power supply splitter cable,
as not enough connectors were available for the new disk.
The splitter came from an old components box and seemed to
have oxidized contact pins. I replaced it with a new one.
The problem was solved, not a single blue screen since
then.

I mention this because no mistake was made while replacing,
formatting or partitioning the drive. Just an old cable
splitter...

Hee hee...

"It's always the cable"
- thanatoid
 
K

kony

First, thanks for the reply. I know this was a long message, but I
had to explain the whole thing.

To answer your questions. What I changed on this computer are as
follows.

1. The computer came with Win2K installed on a 10 gig drive. I
replaced that drive with TWO 20 gig hard drives, and installed Win98.
(Win98 is all I will use. I can't stand anything newer than ME, and
would use WinME but I heard too many stories about problems with it.)

Yes, WinME has some problems, but they are largely in features it
adds to Win98, features that can be disabled. Regardless,
using Win98 would not cause the problem, and is a reasonable
choice for someone needing DOS support.

2. I added another 256meg ram stick. It came with 128megs. Now I have
both.

It would be good to run memtest86+ for several hours, even
overnight or longer to see if you have low rate memory
errors.

3. I added a USB 2.0 card.

What chipset does the motherboard use? If it is one of certain
VIA chipset models, this added device on the PCI bus could be a
cause of data corruption.

4. I removed the internal CD drive, and replaced it with an external
USB CD burner

Personally, I would pull the burner out of the external
enclosure and put it in the system. It will be faster and
much lower overhead in use. If you do that, just make sure
to go into Device Manager's properties for the drive and set
it to use DMA.

Otherwise it's the same.....

You asked about the hardware.....
The computer is an IBM Netvista 6341

I copied the following from Norton System info.
*Bios IBM 03/17/05
*Processor GenuineIntel Family 6 Model 8 1002
*Video 800x600 in True Color, Intel(R) 82810E Graphics Controller Ver.
4.0
*Memory 382 MB
*Bus Type PCI
*Hard Drives 19.14 GB, 18.65 GB

Ok, it's an Intel 810 chipset based board, does not have a
Via chipset. PCI throughput shouldn't be a problem with a
mere USB2 card added, but I would wonder if changing the PCI
card's slot might help.

That wouldn't account for all the issues you're seeing, but
I now wonder if you have more than one problem... and there
has been more than one thing changed. I would definitely
run memtest86+ for several hours before doing anything else
to be sure the memory is at least reasonably stable.

What I did so far.......
I copied all partitions into separate directories on a 3rd (spare) 20
gig drive, so I have a complete backup.

Ok, but until you have run memtest86+ for many hours without
any errors, be very cautious about depending on this backup,
since, as you have already noted, the Agent file was different,
so something may be compromising data integrity.

Reset bios to default, and disabled all Power Savers and screen
savers.

I pushed all internal cables and memory to be sure they were plugged
in well. I removed some dust at the same time. I also pushed all
external cables to check they were tight.

Also inspect motherboard capacitors, and if you had a spare
power supply it would be something else to try... it seems
less likely than other suspects but when a system gets older
it is hard to play the odds anymore; many things taken for
granted with a newer system may need to be checked.

Ran Ad-Aware, AVG Free, and Spybot. No viruses found, except one zip
file which I downloaded recently (and did not open) contained spyware.
I deleted it.

I ran RegSeeker and cleaned the Registry

Usually, registry cleaners are unnecessary and rarely they
can do more harm than good. If all else fails, try a clean
installation of Win98.

I ran Defrag and defragged C: and F: (the others were ok)

You should not defrag any drives until you know if your
memory subsystem is stable! If you have memory errors you
may have now corrupted many many files.

1. How do I do CRC checks?

One way is a shell extension, where you right-click the file
and it shows the CRC value. Frankly I can't remember which
software supports win98 so it may be more time consuming to
find some utilities or plugins to do it.

I hope the following is what I'm thinking of, as I
downloaded it many years ago, not recently.

http://www.freewareweb.com/cgi-bin/archive.cgi?ID=629

Yes, after checking the contents of the file it will do the
job, but I think it may not support files over some size
(maybe 2GB?)

There was such an explosion of freeware software in the past
few years, there is probably something very nice where you
can point the software to two directories of files and have
it recursively check all of the files instead of doing each
one manually. Here's one that seems to do that and has a
free 30-day trial. I'm sure a Google search will turn up
more options; I can't recall the name of the one I was using
on win98.

http://www.tgrmn.com/index.htm
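
For anyone who would rather script it, the "point it at two directories" idea looks roughly like this in Python. It is a sketch under assumptions, not a description of how either utility above works: the function names are invented for illustration, and it needs a machine that can run Python.

```python
import os
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file, read in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def compare_trees(original_dir, copy_dir):
    """Walk original_dir recursively and list files whose copy under
    copy_dir is missing or has a different CRC32."""
    problems = []
    for root, _dirs, files in os.walk(original_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, original_dir)
            dst = os.path.join(copy_dir, rel)
            if not os.path.isfile(dst):
                problems.append((rel, "missing"))
            elif crc32_of_file(src) != crc32_of_file(dst):
                problems.append((rel, "crc mismatch"))
    return sorted(problems)
```

An empty list from compare_trees means every file in the original has a copy with a matching checksum. It does not look for extra files that exist only in the copy.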

2. Where do I get memtest86+?
http://www.memtest.org/



3. If I do have a failing hard drive, which one would it be?
The problem partition is F: which is on my SECOND drive.
Yet, when the computer would not boot, the boot partition is C: on my
FIRST drive?


A failing drive can cause the system to not boot a different
connected drive.

Unplug the questionable drive, rejumper the other drive for
master/slave IF needed, then see if the bootable drive
boots.

If it still does not, run the hard drive manufacturer's
utilities, but first I would do two things:

1) Run memtest86+ to see if memory is ok, with the hard
drives unplugged from power and data cables. If there are
errors, AND since you might have a failing drive, I would do
the most expedient thing next - remove the new memory module
then retest with memtest86+, still having no hard drives
running.

2) If it then passes memtest86+, connect the drives and
proceed to copy off any important data before stressing the
drives further by running the HDD manufacturer's
diagnostics. The most conservative suggestion is to do
these two things before running the system in windows to do
anything else, including not running the CRC checks yet.
"Maybe" your drive isn't failing and this is just a waste of
time - but if the drive is failing it might be important.




Hard drive controllers don't generally fail, but you might
consider using a 2nd cable to connect one or the other of
the drives, then jumpering each to master (or single, for
some IBM or WD drives as noted on their labels). Using a
separate cable per drive should improve performance copying
back and forth between them.

However, while a bad cable might prevent initial detection
or booting from the drive, it should not cause corrupt data
later, as these 20GB drives would have checksums for data
sent through the cable. If the data were corrupted I would
think it most likely due to memory errors, or possibly some
lost sectors since these are now fairly old drives.

On that topic, the drives are at the end of their expected
lifespan. While you have gotten a lot of miles out of that
Win98 system, it might be time to consider some changes. If
you must still use Win98, I would at least consider getting
a new 120GB hard drive to replace one of these old, much
slower drives.
 
G

georget

Hey Kony

I got the Western Digital diagnostics for the questionable drive.
(partition F)
It tested fine. I ran it 3 times.

As far as my boot drive, it's a Quantum Fireball 20g. Quantum no
longer exists. Maxtor bought them, then Seagate bought Maxtor, so
finding a factory diagnostic for that one seems near
impossible, unless someone happens to have saved it on disk and could
upload it.

Then I got memtest86+
How the f**k do I run that thing?
It seems like something only a guy with a 4 year college degree in
computers could understand. Their website is not very useful either.
Seems all they do is advertise to sell their CD.

Yes, I made the floppy and rebooted, that was easy.
After that, I have no clue what to select and can not tell what is
being tested when.

I let it just start on its own and run. I almost immediately get
errors, and after several runs, I always get about 20 to 23 of them
showing. However, I think it's testing the cache, not the ram.
This might be a good tester, but it's surely not user friendly.
I hit the "C" for options, I have no clue which tests to select.
I am no expert, but I am fairly knowledgeable about computers. This
software has me lost.

I removed all but one ram strip. I got those same errors.
Then I swapped to my other strip, and got the same errors.
Then I removed both of them and installed a smaller one from my parts
box, and got the same errors.

I'm sort of thinking it's testing the cache, not the actual ram. I
finally selected test 3, assuming option 1 was L1 cache,
option 2 was L2 cache, and option 3 was the ram. In all honesty, I
have no clue what's running or doing what, except option 3 took forever
to run and I had to cancel so I could use the computer. I'll have to
run that during the night. But in the end I still won't know what was
tested or what is in error.

I wrote down the following.
L1 cache 32K
L2 cache 128K
Memory 128 meg (or whatever ram I have plugged in).

It shows the computer as Celeron 1002mhz
Chipset Intel i810e

This is Memtest86+ v1.70

All I can assume is the first test is the L1 cache, which seems to be
where it gets the errors almost immediately. Where the heck is L1
cache memory plugged in? I don't see any other removable ram.

Then again, in all honesty, I don't have the slightest clue what this
software is doing, so it's useless to me. Whatever is in error, I
have no clue. No matter what ram strips I use, I get the same errors,
and I am sure they are not all bad.

Right now I am running windows on one 64meg strip, since it's the last
one I plugged in. It seems like it's easier to just replace the ram
than run tests that don't tell me anything useful.

Isn't there some easier to use memory testing software, that gives the
results in plain and understandable english?

Thanks

George
 
K

kony

Hey Kony

I got the Western Digital diagnostics for the questionable drive.
(partition F)
It tested fine. I ran it 3 times.

As far as my boot drive, it's a Quantum Fireball 20g. Quantum no
longer exists. Maxtor bought them, then Seagate bought Maxtor, so
finding a factory diagnostic for that one seems near
impossible, unless someone happens to have saved it on disk and could
upload it.

I had some Quantum-designed, Maxtor-labeled 20GB drives that
looked like this:
http://69.36.166.207/usr_1034/maxtor_d740x.jpg

The Maxtor diagnostics worked with them, you might try those
from Maxtor.


Then I got memtest86+
How the f**k do I run that thing?
It seems like something only a guy with a 4 year college degree in
computers could understand. Their website is not very useful either.
Seems all they do is advertise to sell their CD.

Yes, I made the floppy and rebooted, that was easy.
After that, I have no clue what to select and can not tell what is
being tested when.

You make the boot floppy, boot the system to that, and it
runs the default test without any user interaction
necessary.

It goes through all the tests, then starts over going
through all the tests again and again in a perpetual loop.
That is what you want, to let it loop for many hours... If I
suspected a system might have memory errors I would at least
run it for 24 hours unless the system had only very little,
quite fast memory so it had made more loops than usual.

The important part is that it indicates no errors. If it
does indicate any you need to resolve that before worrying
about hard drives or software, etc. You should actually not
even boot windows if it indicates errors as that can result
in file corruption, though you might already have file
corruption from running the OS and especially doing the defrags,
"IF" the memory is unstable.


I let it just start on its own and run. I almost immediately get
errors, and after several runs, I always get about 20 to 23 of them
showing. However, I think it's testing the cache, not the ram.
This might be a good tester, but it's surely not user friendly.
I hit the "C" for options, I have no clue which tests to select.
I am no expert, but I am fairly knowledgeable about computers. This
software has me lost.

It tests cached and uncached - all the memory except the
tiny portion occupied by the software. It is almost too
user friendly in that you don't need to do anything but boot
it and leave it running unattended to see if there are errors.
Once you start seeing errors, though, you might as well stop
the test, as that is already a sign of memory instability
that has to be resolved however necessary; then repeat the
test to confirm the solution worked.

So you do have memory errors... Pull out one memory module
and retest, then test with only the other memory module
installed. It could be that a module is physically bad, or incapable
of the timings the bios is using, or it could be an
instability from bus loading when both modules are
installed. Sometimes a resolution is manually setting
slower memory timings, or if the memory bus is running at
100MHz, reducing that to 66MHz. I don't recall if you have
a P3 or Celeron processor, but with the Celeron the
reduction to 66MHz memory bus isn't such a penalty except
when it comes to video performance... but ultimately if
performance is very important, switching to more modern
parts is the obvious solution.
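
To put a rough number on that penalty, here is a back-of-the-envelope sketch. It assumes a 64-bit SDRAM data bus moving one word per clock. The thread does not state the bus width, so that is an assumption, and real-world throughput is lower than this theoretical peak.

```python
# Assumption: 64-bit wide SDRAM bus, one transfer per clock cycle.
BUS_WIDTH_BITS = 64


def peak_bandwidth_mb_s(clock_mhz, bus_width_bits=BUS_WIDTH_BITS):
    """Theoretical peak memory bandwidth in MB/s (10**6 bytes per second)."""
    return clock_mhz * bus_width_bits // 8


fast = peak_bandwidth_mb_s(100)
slow = peak_bandwidth_mb_s(66)
print("100MHz bus:", fast, "MB/s peak")
print("66MHz bus:", slow, "MB/s peak")
print("fraction of bandwidth kept at 66MHz:", slow / fast)
```

With integrated video such as the 810's, the graphics share that same memory bandwidth, which fits the remark above that video performance is where the slower bus shows.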



I removed all but one ram strip. I got those same errors.
Then I swapped to my other strip, and got the same errors.
Then I removed both of them and installed a smaller one from my parts
box, and got the same errors.

Hmm. I should read ahead in posts more often. I suspect it
is either testing a portion of memory that the motherboard
has reserved (someone may correct me about this, I don't
recall what will happen if it tries to do that), or the
motherboard itself is failing.

Are the errors always at the same addresses? Given three
different modules run by themselves, always having errors at
the same addresses would tend to suggest it is a motherboard
reserved memory issue, not actual errors per se - except
there could also be errors not always at the same addresses,
a combination of both reserved memory addresses and actual
instability due to motherboard failure. Without seeing what
it reported, it is a bit hard to speculate. Perhaps the
memtest86+ author could have included helpfiles to better
describe where to go when seeing results in the program but
there are so many variables it may be better left to the one
running the test, where to go next in trying to resolve the
problem.
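
One way to answer the "same addresses?" question is to write down the failing addresses from each run and compare the lists. A small Python sketch of that comparison follows. The addresses in it are invented purely for illustration and are NOT readings from this machine, and the function name is hypothetical.

```python
def split_error_addresses(runs):
    """Given one set of failing addresses per test run, return
    (addresses that failed in every run, addresses that failed only sometimes)."""
    if not runs:
        return set(), set()
    common = set.intersection(*runs)
    varying = set.union(*runs) - common
    return common, varying


# Made-up example data: one set per run, e.g. one run per memory module.
example_runs = [
    {0x0009F000, 0x0009F004},
    {0x0009F000, 0x0009F004},
    {0x0009F000, 0x0009F004, 0x01A3C010},
]

common, varying = split_error_addresses(example_runs)
print("failed in every run:", sorted(hex(a) for a in common))
print("failed only sometimes:", sorted(hex(a) for a in varying))
```

Addresses that fail in every run, whichever module is installed, fit the reserved-memory explanation. Addresses that move around between runs fit real instability.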

I'm sort of thinking it's testing the cache, not the actual ram. I
finally selected test 3, assuming option 1 was L1 cache,
option 2 was L2 cache, and option 3 was the ram. In all honesty, I
have no clue what's running or doing what, except option 3 took forever
to run and I had to cancel so I could use the computer. I'll have to
run that during the night. But in the end I still won't know what was
tested or what is in error.

It is not only testing cache. The complete testing runs
through all tests and repeats, shows a number of "passes",
IIRC, and it would need to run for several hours to find
some errors if a system were barely unstable.


I wrote down the following.
L1 cache 32K
L2 cache 128K
Memory 128 meg (or whatever ram I have plugged in).

It shows the computer as Celeron 1002mhz
Chipset Intel i810e

This is Memtest86+ v1.70

All I can assume is the first test is the L1 cache, which seems to be
where it gets the errors almost immediately. Where the heck is L1
cache memory plugged in? I don't see any other removable ram.

No, forget about the idea that it's only testing L1 cache
because it is not. However, for informational purposes, the
L1 and L2 caches are integrated onto the die of the processor
itself. If you had such significant errors in the L1 and L2
cache (which is very unlikely by itself unless the processor
were overclocked), the system wouldn't be running at all.

Then again, in all honesty, I don't have the slightest clue what this
software is doing, so it's useless to me. Whatever is in error, I
have no clue. No matter what ram strips I use, I get the same errors,
and I am sure they are not all bad.

Right now I am running windows on one 64meg strip, since it's the last
one I plugged in. It seems like it's easier to just replace the ram
than run tests that don't tell me anything useful.

Isn't there some easier to use memory testing software, that gives the
results in plain and understandable english?

Memtest86+ is doing exactly what it should, indicating
errors. Give us feedback about what I'd mentioned above,
whether _all_ of these errors are always at the same memory
locations, or since many many errors will be scrolling down
the screen without a way to see them all, at least whether
it seems that they always occur at the exact same addresses.

If they do, I think memtest is testing an area it shouldn't
and you should use the user settings to specify not to test
this memory range, but still running all tests for several
hours. If it can't run without a single error the board may
be dying... but more feedback would help to determine this.
 
M

~misfit~

Somewhere on teh intarweb "kony" typed:

It could be that a module is physically bad, or incapable
of the timings the bios is using

Memtest86+ is doing exactly what it should, indicating
errors.

If it can't run without a single error the board may
be dying... but more feedback would help to determine this.

This is my opinion too. Either that or, as you mentioned Kony, the timings
set in BIOS are incorrect for the RAM type. /My/ next course of action would
be to relax the RAM timings in BIOS and re-run tests to see if it makes a
difference.

Hopefully that'll do it. However, with hardware that age, from the era of
"bad caps", there's a good chance that the RAM slots' (on-board) power supply
circuitry could be adversely affected by an out-of-spec capacitor, maybe
giving too low a voltage, or excessive ripple.
 
M

Marc

Isn't there some easier to use memory testing software, that gives the
results in plain and understandable english?

There is DocMemory :
http://www.pcworld.com/downloads/file/fid,20541-order,1-page,1-c,alldownloads/description.html
 
G

georget

I had some quantum designed, maxtor labeled 20GB drives that
looked like this:
http://69.36.166.207/usr_1034/maxtor_d740x.jpg

The Maxtor diagnostics worked with them, you might try those
from Maxtor.




You make the boot floppy, boot the system to that, and it
runs the default test without any user interaction
necessary.

It goes through all the tests, then starts over going
through all the tests again and again in a perpetual loop.
That is what you want, to let it loop for many hours... If I
suspected a system might have memory errors I would at least
run it for 24 hours unless the system had only very little,
quite fast memory so it had made more loops than usual.

The important part is that it indicates no errors. If it
does indicate any you need to resolve that before worrying
about hard drives or software, etc. You should actually not
even boot windows if it indicates errors as that can result
in file corruption, though you might already have file
corruption from running OS and especailly doing the defrags,
"IF" the memory is instable.




It tests cached and uncached - all the memory except the
tiny portion occupied by the software. It is almost too
user friendly in that you don't need to do anything but boot
it and wait to see if there are errors, leaving it running
unattended though once you start seeing errors you might as
well stop the test as it is already a sign the memory
instability has to be resolved however necessary, then
repeat the test to confirm the solution worked.

So you do have memory errors... Pull out one memory module
and retest, then test with only the other memory module
installed. It could be that is physically bad, or incapable
of the timings the bios is using, or it could be an
instability from bus loading when both modules are
installed. Sometimes a resolution is manually setting
slower memory timings, or if the memory bus is running at
100MHz, reducing that to 66MHz. I don't recall if you have
a P3 or Celeron processor, but with the Celeron the
reduction to 66MHz memory bus isn't such a penalty except
when it comes to video performance... but ultimately if
performance is very important, switching to more modern
parts is the obvious solution.





Hmm. I should read ahead in posts more often. I suspect it
is either testing a portion of memory that the motherboard
has reserved (someone may correct me about this, I don't
recall what will happen if it tries to do that), or the
motherboard itself is failing.

Are the errors always at the same addresses? Given three
different modules run by themselves, always having errors at
the same addresses would tend to suggest it is a motherboard
reserved memory issue, not actual errors per se - except
there could also be errors not always in the same addresses,
a combination of both reserved memory addresses and actual
instability due to motherboard failure. Without seeing what
it reported, it is a bit hard to speculate. Perhaps the
memtest86+ author could have included helpfiles to better
describe where to go when seeing results in the program, but
there are so many variables that where to go next in trying
to resolve the problem may be better left to the one running
the test.



It is not only testing cache. The complete testing runs
through all tests and repeats, shows a number of "passes",
IIRC, and it would need to run for several hours to find
some errors if a system were only marginally unstable.




No, forget about the idea that it's only testing L1 cache
because it is not. However, for informational purposes, the
L1 and L2 cache are integrated onto the die of the processor
itself. If you had such significant errors in the L1 and L2
cache (which is very unlikely by itself unless the processor
were overclocked), the system wouldn't be running at all.


Memtest86+ is doing exactly what it should, indicating
errors. Give us feedback about what I'd mentioned above,
whether _all_ of these errors are always at the same memory
locations, or since many many errors will be scrolling down
the screen without a way to see them all, at least whether
it seems that they always occur at the exact same addresses.

If they do, I think memtest is testing an area it shouldn't
and you should use the user settings to specify not to test
this memory range, while still running all tests for several
hours. If it can't run without a single error the board may
be dying... but more feedback would help to determine this.
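
One way to answer the "same addresses every time?" question is to jot down the failing addresses from a few runs and compare them. A minimal sketch, assuming the addresses are copied by hand off the screen (the addresses below are invented placeholders, not real results):

```python
def classify_errors(runs):
    """runs: non-empty list of sets of failing addresses, one set per run."""
    always = set.intersection(*runs)   # failed in every run
    ever = set.union(*runs)            # failed in at least one run
    sometimes = ever - always          # failed in some runs but not all
    return always, sometimes


# Hypothetical addresses from three runs (made up for illustration).
runs = [
    {0x00100000, 0x00100004, 0x01F3C210},
    {0x00100000, 0x00100004},
    {0x00100000, 0x00100004, 0x02B71008},
]
always, sometimes = classify_errors(runs)
print("same in every run:  ", sorted(hex(a) for a in always))
print("varies between runs:", sorted(hex(a) for a in sometimes))
```

Addresses that fail in every run, whichever module is fitted, fit the reserved-range idea. Addresses that move around between runs point more toward real instability.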

Thanks for all the help.
The errors begin almost immediately, within the first minute of the
testing. That's a fast test that only lasts a couple minutes, which
is why I suspect it's testing the first item on the list which is that
(small) L1 cache. It always gets 20+ errors, I think 23. I will have
to check to see if it's exactly in the same place.

However, there is another problem more serious at the moment. While
my F: drive is the one that got corrupted (2nd drive), the problem is
my first drive (boot drive). It totally died last evening. I was on
the phone in front of the computer, I was shelled to dos and had my
dos database (phone list) on the screen. I was not even touching the
computer, and I heard the hard drives shut off. Then my screen went
bonkers and it froze. I rebooted and that drive would not spin up at
all. I unplugged drive 2, tested the voltage going to the drives, it
seems a little low at 4.4V and 10.6V, but this is a $10 test meter so
I don't know just how accurate it is.

I plugged in drive 2 (no data cable, just the power), and it spun
right up. I removed everything except the drive 1, rebooted, and it
would not spin up at all. I opened my power supply and cleaned the
dust out, and looked for anything that looked bad. Everything looked
normal. Just for the heck of it, I plugged that drive 1 into another
ancient computer (just the power plug), it would not spin up.

I plugged in the 10gig drive that came with this computer, which has
Win2000 on it. I hate win2k, but I never deleted it off that drive
because I got plenty of 10gig drives laying around. Win2k booted
right up. I plugged my drive 2 in along with the Win2k drive and
could read it fine (except the drive letters are wrong because Win2k
insists that my USB devices are supposed to be placed ahead of the 2nd
hard drive.......(even after I unplugged them)........ Now you know
why I hate Win2k).

Anyhow, I was able to read that drive 2 just fine.

Before giving up all hope on that drive 1, I took the circuit board
off of it, and put it back on. This is not the first Quantum Fireball
20gig drive that I have had similar problems with. (and I will never
buy another Quantum drive). I replaced the board, and it still would
not spin up. I ripped the board off a (working) Quantum 10gig drive
and put it in this drive. The drive spun up, but would not read the
data, just kept clicking. I took that original board and washed it in
water, took my air compressor and dried it, banged it around a few
times and sat it in a heat register to completely dry. I put it back
on the drive and that drive is running perfectly right now.

Go figure !!!

Quantum drives have very fine and very short pins that contact the
board, and I think they get dirty and lose connection. I am doing
another backup as I type this, because there were some recent changes
and downloads I wanted to save. This drive is going to be tossed as
soon as I back it up. I am going to attempt to use partition magic
and make a clone, if it runs that long. If not, I have an older clone
and can use that and then install my backup to it. I have too much
data to reinstall windows from scratch, and since my original win98 CD
is scratched and will no longer read, I have the install cabs on this
harddrive. I just copied them to another good drive.

Well, I better go and make that clone before it dies again.

George
 
K

kony

Thanks for all the help.
The errors begin almost immediately, within the first minute of the
testing. That's a fast test that only lasts a couple minutes, which
is why I suspect it's testing the first item on the list which is that
(small) L1 cache. It always gets 20+ errors, I think 23. I will have
to check to see if it's exactly in the same place.

It is very, very rare for the L1 or L2 cache to have errors.
First, unlike main system memory it employs ECC so it should
not return a value at all unless horribly unstable - enough
that it wouldn't even be able to decompress and execute the
bios prior to booting the memory test.


However, there is another problem more serious at the moment. While
my F: drive is the one that got corrupted (2nd drive), the problem is
my first drive (boot drive). It totally died last evening. I was on
the phone in front of the computer, I was shelled to dos and had my
dos database (phone list) on the screen. I was not even touching the
computer, and I heard the hard drives shut off. Then my screen went
bonkers and it froze. I rebooted and that drive would not spin up at
all. I unplugged drive 2, tested the voltage going to the drives, it
seems a little low at 4.4V and 10.6V, but this is a $10 test meter so
I don't know just how accurate it is.

Not having your meter I can't say for certain, but generally
even the really cheap ones are accurate enough until they
get quite old. By accurate enough I would expect it to be
within 0.1V at this range, but you can only compare to
another known more accurate meter to be sure.

If a drive has failed, it could be putting a high load on
the PSU connector and cause a low reading, but you ought to
remeasure PSU voltage without it connected and determine if
the meter is accurate because 4.4 and 10.6V is certainly too
low, a sign of a problem if still that low.
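
To put numbers on "too low": ATX supplies are normally specified to hold the +5V and +12V rails within 5% of nominal. A quick sketch of that check, using the two readings from this thread (the function name is just for the example, and the 5% figure is the usual spec value rather than something measured here):

```python
TOLERANCE = 0.05  # the +/-5% usually specified for the +5V and +12V rails


def rail_status(nominal, measured, tolerance=TOLERANCE):
    """Return (fractional deviation from nominal, True if within tolerance)."""
    deviation = (measured - nominal) / nominal
    return deviation, abs(deviation) <= tolerance


# Readings reported earlier in the thread: 4.4V on the 5V rail, 10.6V on the 12V rail.
for nominal, measured in [(5.0, 4.4), (12.0, 10.6)]:
    deviation, ok = rail_status(nominal, measured)
    verdict = "within" if ok else "outside"
    print(f"{nominal:4.1f}V rail: measured {measured}V ({deviation:+.1%}), {verdict} +/-5%")
```

Both readings come out roughly 12% low, well outside that window, so either the supply or the meter needs a second look.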


I plugged in drive 2 (no data cable, just the power), and it spun
right up. I removed everything except the drive 1, rebooted, and it
would not spin up at all. I opened my power supply and cleaned the
dust out, and looked for anything that looked bad. Everything looked
normal. Just for the heck of it, I plugged that drive 1 into another
ancient computer (just the power plug), it would not spin up.

I plugged in the 10gig drive that came with this computer, which has
Win2000 on it. I hate win2k, but I never deleted it off that drive
because I got plenty of 10gig drives laying around. Win2k booted
right up. I plugged my drive 2 in along with the Win2k drive and
could read it fine (except the drive letters are wrong because Win2k
insists that my USB devices are supposed to be placed ahead of the 2nd
hard drive.......(even after I unplugged them)........ Now you know
why I hate Win2k).

I love win2k, more stable and best Windows ever (before they
added a lot of crap to WinXP). You can go into Computer
Management -> Disk Management, and assign any drive letter
you want, permanently... win2k will let you do that while
win9x forces the DOS ordering and continual shuffling of
letters depending on what's plugged in at any moment.


Anyhow, I was able to read that drive 2 just fine.

Before giving up all hope on that drive 1, I took the circuit board
off of it, and put it back on. This is not the first Quantum Fireball
20gig drive that I have had similar problems with. (and I will never
buy another Quantum drive). I replaced the board, and it still would
not spin up. I ripped the board off a (working) Quantum 10gig drive
and put it in this drive. The drive spun up, but would not read the
data, just kept clicking. I took that original board and washed it in
water, took my air compressor and dried it, banged it around a few
times and sat it in a heat register to completely dry. I put it back
on the drive and that drive is running perfectly right now.

Go figure !!!

That was a stroke of luck... I doubt I'd trust that
drive for much more than a paperweight at this point, but at
least you have a chance to get any data off if you hadn't
already. Regardless, a 20GB drive is quite old by now;
it's ripe for failure even by another cause at this
age.


Quantum drives have very fine and very short pins that contact the
board, and I think they get dirty and lose connection. I am doing
another backup as I type this, because there were some recent changes
and downloads I wanted to save. This drive is going to be tossed as
soon as I back it up. I am going to attempt to use partition magic
and make a clone, if it runs that long. If not, I have an older clone
and can use that and then install my backup to it. I have too much
data to reinstall windows from scratch, and since my original win98 CD
is scratched and will no longer read, I have the install cabs on this
harddrive. I just copied them to another good drive.

Well, I better go and make that clone before it dies again.

George

The odd thing is, lack of circuit board pin contact
shouldn't pull down the power to the 4.4/10.6V values you
saw. Something else still seems wrong, but maybe it's just
the meter calibration.
 
T

thanatoid

<SNIP>

WHY do people who obviously know what they're talking about
(like you) post so seldom and morons (like myself and others)
post a lot?

Sigh.
 
G

georget

Thanks for all the help.
I am going to do some more testing but at the moment this computer is
working good after reinstalling a new hard drive. That was a chore,
for some reason the 80gig drive would not cooperate. Partition magic
seemed to keep freezing up. This used drive had some oddball format
on it, and I could not get rid of it. I finally used Fdisk and
removed everything, but fdisk doesn't recognise drives this size
properly. First I tried to clone my boot drive to a 15 gig drive I
had in my junk box (another quantum), and after 3 hours of fighting
with it, I went to a local used computer store and bought this Western
Digital 80gig. I have always liked Western Digital. I have never had
any problems with them. On the other hand, every Quantum I ever used
was a POS.

As for the voltages, I will have to borrow another meter. This one I
have is brand new, but it's a cheapie Walmart one. I mostly just
bought it to test automotive wiring. But with the new harddrive I am
reading 4.9V and 11.6V. That seems reasonable (I think).

I agree that the author of Memtest86 should offer more help files.
I am going to run it again now that the drives are fixed. I don't know
what I'm going to do with all this extra drive space now :)

Well, that's the latest.

By the way, that defective drive continued to run fine till I finally
removed it. I'm keeping it for the moment, just in case this was to
fail, it's still a backup (maybe). Sometimes beating the crap out of
computer parts does work <LOL>.

Thanks

George
 
K

kony

<SNIP>

WHY do people who obviously know what they're talking about
(like you) post so seldom and morons (like myself and others)
post a lot?

Sigh.


You're underestimating yourself.

There's bound to be a lot of things you know that I don't.
Just spread info, that's what separates us from apes.
 
K

kony

Thanks for all the help.
I am going to do some more testing but at the moment this computer is
working good after reinstalling a new hard drive. That was a chore,
for some reason the 80gig drive would not cooperate. Partition magic
seemed to keep freezing up. This used drive had some oddball format
on it, and I could not get rid of it. I finally used Fdisk and
removed everything, but fdisk doesn't recognise drives this size
properly. First I tried to clone my boot drive to a 15 gig drive I
had in my junk box (another quantum), and after 3 hours of fighting
with it, I went to a local used computer store and bought this Western
Digital 80gig. I have always liked Western Digital. I have never had
any problems with them. On the other hand, every Quantum I ever used
was a POS.

As for the voltages, I will have to borrow another meter. This one I
have is brand new, but it's a cheapie Walmart one. I mostly just
bought it to test automotive wiring. But with the new harddrive I am
reading 4.9V and 11.6V. That seems reasonable (I think).

The 11.6V value could be cause for concern, but I have seen
systems that ran ok when their 12V rail was weak and it
dropped that low. Further, we still don't know if your meter
is accurate; if it is inaccurate then the higher the voltage
on the same scale, the more off it may be.

On the other hand, I'm not necessarily thinking the meter is
off this much. I have had cheap meters that I'd just as soon
throw away as buy a new battery for, but their accuracy
wasn't very bad at that point. Voltage readings on a 0-~20V
scale are usually something even the cheapest of meters can
do well enough for the precision we'd need checking a PSU.
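
The point about an inaccurate meter being further off at higher voltages on the same scale amounts to a proportional (gain) error. A rough sketch of what that would imply, assuming the error is purely proportional and that the 5V rail really sits at 5.0V (both are assumptions for the sake of the arithmetic, not measurements):

```python
def corrected_12v(read_5v, read_12v, true_5v=5.0):
    """Rescale the 12V reading, assuming a purely proportional meter error."""
    scale = read_5v / true_5v   # how far off the meter reads on the 5V rail
    return read_12v / scale


# The two pairs of readings reported in the thread.
for read_5v, read_12v in [(4.4, 10.6), (4.9, 11.6)]:
    estimate = corrected_12v(read_5v, read_12v)
    print(f"read {read_5v}V / {read_12v}V -> 12V rail estimated at {estimate:.2f}V")
```

Note the two pairs imply different scale factors (about 0.88 and 0.98) for the same meter, so a fixed calibration error alone can't account for both sets of readings. Comparing against a known good meter is still the only way to settle it.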


I agree that the author of Memtest86 should offer more help files.
I am going to run it again now that the drives are fixed. I don't know
what I'm going to do with all this extra drive space now :)

Fortunately, beyond drive space it should have much higher
read/write performance, and a few years of use ahead of it.

Well, that's the latest.

By the way, that defective drive continued to run fine till I finally
removed it. I'm keeping it for the moment, just in case this was to
fail, it's still a backup (maybe). Sometimes beating the crap out of
computer parts does work <LOL>.


Keep in mind that different drives may have different
tolerance for low voltage. "IF" your PSU is undervolting the
12V rail, having one drive work while another doesn't isn't
necessarily a sign that the drive is the problem, BUT if it
keeps working ok that is proof enough. I would still try
to confirm the low 12V rail voltage, or reject the PSU if it
can't maintain higher than that, as on a legacy system there
should be minimal loading on the 12V rail. More often the
12V rail would read a little over 12.0V rather than under
that value.
 
