Utility to test IDE cable connections?

David R

Are there any (hardware or software) utilities which will test if all the
pins in the 40-way connector on an IDE cable have all made proper contact
with the socket on the hard drive and/or with the socket on the
motherboard?

I recently had a strange fault which was due to one of the 40 IDE
connections being inadequate. Now, I would like to be able to actually
test for such a thing.
 
Tx2

David R said:
Are there any (hardware or software) utilities which will test if all the
pins in the 40-way connector on an IDE cable have all made proper contact
with the socket on the hard drive and/or with the socket on the
motherboard?

I recently had a strange fault which was due to one of the 40 IDE
connections being inadequate. Now, I would like to be able to actually
test for such a thing.


Wouldn't it be easier to simply hook up a replacement cable? If it
works, the previous cable is duff ....
 
Synapse Syndrome

David R said:
Are there any (hardware or software) utilities which will test if all the
pins in the 40-way connector on an IDE cable have all made proper contact
with the socket on the hard drive and/or with the socket on the
motherboard?

I recently had a strange fault which was due to one of the 40 IDE
connections being inadequate. Now, I would like to be able to actually
test for such a thing.


If you are deeply serious about wanting to test your cables you may want to
invest in this

http://www.abccables.com/258895.html

ss.
 
no66y©

"Tx2" wrote in message
I still prefer my method, bung another cable on and see
what happens...

Yep, that's the method I use also :)


--
No66y©
Those who find they're touched by madness
Sit down next to me

Reply to address is a spam trap.
Use no66y [at] breathe [dot] com
 
David R

Are there any (hardware or software) utilities which will

Tx2 said:
Wouldn't it be easier to simply hook up a replacement cable?
If it works, the previous cable is duff ....


The "duff" cable worked. But it was giving errors. XP seemed to detect
them but they did not seem to be critical in the sense of preventing data
being stored.

However if I could not sense the cable was "duff" and the errors were
indeed critical then I would be in trouble. This is what I want to avoid.
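
One software-only way to get some warning is to write known data to the drive and read it back. The following is a minimal sketch, not a real pin-by-pin continuity test: the function name `write_read_verify` and the choice of byte patterns are assumptions made for illustration, and the read-back may be served from the operating system's cache rather than from the drive, so a pass proves little while a failure is worth investigating.

```python
import hashlib
import os
import tempfile

# Byte patterns chosen so that neighbouring data bits take opposite values
# (0x55 = 01010101, 0xAA = 10101010). The choice is an assumption, not a
# documented test procedure for IDE cables.
PATTERNS = (b"\x55", b"\xaa", b"\x00", b"\xff")


def write_read_verify(directory, size=1 << 20):
    """Write each pattern to a scratch file in `directory`, read it back,
    and return the list of patterns whose read-back did not match."""
    failures = []
    for pattern in PATTERNS:
        data = pattern * size
        expected = hashlib.sha256(data).hexdigest()
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
                handle.flush()
                os.fsync(handle.fileno())  # ask the OS to push the data to the device
            with open(path, "rb") as handle:
                actual = hashlib.sha256(handle.read()).hexdigest()
            if actual != expected:
                failures.append(pattern)
        finally:
            os.remove(path)
    return failures


if __name__ == "__main__":
    # Point this at a directory on the drive under suspicion.
    print(write_read_verify(tempfile.gettempdir()))
```

An empty list means every pattern read back intact. Checking the Event Log after a run is still worthwhile, since retried transfers can succeed and leave the data correct.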
 
David R

JAD said:
error being unable to enable DMA?


I was in fact able to enable DMA.

The hard drive seemed to work well but I was getting this written into XP's
Event Log on every single attempt to write a file to that drive.




-------- QUOTE --------

Event Type: Warning
Event Source: Disk
Event Category: None
Event ID: 51
Date: 07/08/2004
Time: 11:38:20
User: N/A
Computer: X
Description:
An error was detected on device \Device\Harddisk0\D during a paging
operation.


Data:
0000: 04 01 68 00 01 00 b6 00 ..h...¶.
0008: 00 00 00 00 33 00 04 80 ....3..?
0010: 2d 01 00 00 00 00 00 00 -.......
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 50 0a 0a 00 00 00 00 00 P.......
0030: ff ff ff ff 03 00 00 00 ÿÿÿÿ....
0038: 40 00 00 4f 22 00 00 00 @..O"...
0040: ff 20 0a 12 8c 02 00 40 ÿ ..?..@
0048: 00 4c 00 00 0a 00 00 00 .L......
0050: 00 00 00 00 a8 3f b7 82 ....¨?·?
0058: 00 00 00 00 b8 3b 89 82 ....¸;??
0060: 88 b1 f3 82 37 a5 01 00 ?±ó?7¥..
0068: 2a 00 00 01 a5 37 00 00 *...¥7..
0070: 26 00 00 00 00 00 00 00 &.......
0078: 00 00 00 00 00 00 00 00 ........
0080: 00 00 00 00 00 00 00 00 ........
0088: 00 00 00 00 00 00 00 00 ........

-------- END QUOTE --------
 
CBFalconer

David said:
.... snip ...

The "duff" cable worked. But it was giving errors. XP seemed
to detect them but they did not seem to be critical in the sense
of preventing data being stored.

However if I could not sense the cable was "duff" and the errors
were indeed critical then I would be in trouble. This is what I
want to avoid.

Normally the HD system stores and recovers data using various
error detecting protocols, and will do a retry if the action
fails. This doesn't generally cover the transmission over those
cables; however, any faults there should be fairly gross and
are not likely to go undetected.
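
The detect-and-retry idea can be sketched roughly as follows. This is an illustration only: `zlib.crc32` stands in for whatever check the drive actually uses, and `send_with_retry` and `make_flaky_channel` are hypothetical names invented for the example.

```python
import zlib


def send_with_retry(payload, channel, max_attempts=3):
    """Pass `payload` through `channel` (a function that may corrupt bytes)
    and retry until the received data matches the sender's CRC.
    Returns (received_data, attempts_used)."""
    expected_crc = zlib.crc32(payload)
    for attempt in range(1, max_attempts + 1):
        received = channel(payload)
        if zlib.crc32(received) == expected_crc:
            return received, attempt
    raise OSError("transfer failed after %d attempts" % max_attempts)


def make_flaky_channel(bad_transfers):
    """Hypothetical channel that flips one bit in each of the first
    `bad_transfers` transfers and is clean afterwards."""
    state = {"count": 0}

    def channel(data):
        state["count"] += 1
        if state["count"] <= bad_transfers:
            corrupted = bytearray(data)
            corrupted[0] ^= 0x01  # single-bit error, e.g. one marginal contact
            return bytes(corrupted)
        return data

    return channel


data, attempts = send_with_retry(b"sector contents", make_flaky_channel(1))
print(data, attempts)
```

The data arrives intact, but only on the second attempt. A retry that succeeds hides the fault from the user, which is the kind of behaviour that shows up as warnings in a log rather than as lost files.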

However, the memory in your system, if not protected with ECC, is
another matter. There, an error will go completely unnoticed. Here is
something I wrote about two years ago:

A Tale of Two Machines - by Charles Falconer :)
or What the Dickens
============================================

Your system is updating or moving a file. This may be the
operation of a database application, disk defragmentation, the
operating system updating the last accessed date, or almost anything.
At some point in the operation the data is residing in memory. Here
comes a cosmic ray which trashes some bit. The result is written
out to storage as a valid value.

Nothing special happens. One week, month, year later you access
that data, or use the executable file that was fouled, and
possibly something obvious happens. Or not, it may just result in
trashing some further dependent data, and the obvious fault gets
postponed further.

In the interim you have dutifully made backups. By now the
backups you made before that cosmic ray happened are long gone,
overwritten, and probably pretty useless even if you still have
them, because most of the data on them is obsolete.

So you restore everything, including that fouled file or files.
Maybe the fault shows up again, and you start swearing at the
hardware, software, wife, dog, whatever. Maybe it waits around
for another period before showing up. But it is lurking there,
waiting to bite at the worst possible time (Murphy ensures this).

Still no sign of hardware troubles. The memory checks out
perfectly (unless you get a suitable cosmic ray during the
check). Backups still do no good. Neither does cursing.

How many days or weeks have you now lost? Your customer has long
gone elsewhere. How many irate calls to some ignoramus on some
help desk have you placed, and at what cost? You may well resolve
it by a full reinstall, but if the fault is in your own data that
won't help either. If you go and buy a new machine and install
those backed up faulty data files the error follows right along
like a tame puppy dog.

Or, another scenario, the dropped bit changes an accounting
value. The resulting reports are off a few dollars (or more, ever
hear of someone getting a pay check cut for an extra million?).
Your customers curse you for flakey service, and go elsewhere.
Maybe the IRS gets snitty about something that doesn't balance,
and attaches your whole business. Murphy carries on.

Here comes the second machine. Now consider the system with ECC
memory installed, enabled, and functioning. It probably slowed
down by some fraction of one percent. Did you notice? It
probably cost you twenty to a hundred US dollars extra. Did you
really notice?

However, the ECC memory system noticed the cosmic ray effect, and
corrected it immediately. You certainly didn't notice that. But
neither did you notice all the other potential problems that could
appear on the non-ECC machine.

Of course you COULD get along without the ECC and be lucky. You
COULD indulge in unsafe sex with some stranger and be lucky. You
COULD ignore that red light and be lucky. At least the cause and
effect are obvious in the red light case.

My recommendation: ALWAYS insist on ECC memory.
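
How a memory system can notice and correct a flipped bit can be illustrated with a toy Hamming(7,4) code. This is a sketch of the principle only, and an assumption as far as real hardware goes: ECC memory works on much wider words and also detects double-bit errors, which this small example does not.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as 7 bits laid out as positions 1..7:
    p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code_bits):
    """Return (data_bits, error_position). The position is 1-based and is
    0 when no error was detected. A single flipped bit is corrected."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the damaged position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # the "cosmic ray": flip the bit at position 5
recovered, where = hamming74_decode(stored)
print(recovered, where)
```

The decoder returns the original four data bits together with the position that was repaired, so the corruption never reaches the file being written.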
 
Lil' Dave

While I agree with another post that physical memory (RAM), as a
component of I/O, can be a problem, there is nothing mystical about it,
i.e. cosmic rays. Nor do I agree that ECC should be used for an at-home
desktop. Stick with quality memory, from Crucial for instance. Don't mix
and match memory either.

Another part of the I/O is the IDE cable itself. Stick with 80-wire
ribbon cables, 18 inches long. Use the default master/slave arrangement,
where the master is at the end and the slave in the middle.
 
CJT

Lil' Dave said:
While I agree with another post that physical memory (RAM), as a
component of I/O, can be a problem, there is nothing mystical about it,
i.e. cosmic rays. Nor do I agree that ECC should be used for an at-home
desktop. Stick with quality memory, from Crucial for instance. Don't mix
and match memory either.

You seem to be mixing apples and oranges (type of memory and
manufacturer). Crucial sells ECC.
 
Alex Fraser

CJT said:
You seem to be mixing apples and oranges (type of memory and
manufacturer). Crucial sells ECC.

You seem to be missing the point ;). I think the point is that ECC is not
necessary for desktop systems if you use good quality non-ECC memory, such
as Crucial.

The fact that the majority of chipsets on motherboards used for desktop
systems can't take advantage of ECC makes it largely a moot point anyway.

Alex
 
Michael Salem

Lil' Dave said:
While I agree with another post that physical memory (RAM), as a
component of I/O, can be a problem, there is nothing mystical about it,
i.e. cosmic rays.

What's "mystical" about cosmic rays? They do reach earth. Cosmic rays
and local radioactive decay can and do cause computer memory errors
(IBM Journal of Research and Development, Volume 40, Number 1).

A test made by IBM on a 4Mbit DRAM found a soft error rate of about 6000
in a billion chip hours. A similar test in a vault under 20 tons of rock
produced no errors.

Lil' Dave said:
Nor do I agree that ECC should be used for an at-home desktop.

Is there any reason not to use ECC besides some cost and a very small
loss of performance?

I suppose this comes down to what a "home computer" is. Some may be used
to play games and write letters; others may archive a lifetime's worth
of work.

If not ECC memory, there is an advantage to using parity-checked memory; a
memory error should cause the computer to halt with a warning, rather
than corrupting files.
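
The difference between parity and no check at all can be shown with a small sketch. The names `store`, `load` and `ParityError` are hypothetical, and real parity memory does this in hardware, typically with one extra bit per byte.

```python
class ParityError(Exception):
    """Raised when a stored byte no longer matches its parity bit."""


def parity_bit(byte):
    """Even parity: 1 if the byte has an odd number of set bits, else 0."""
    return bin(byte).count("1") % 2


def store(byte):
    """Return the byte together with the parity bit saved alongside it."""
    return byte, parity_bit(byte)


def load(byte, saved_parity):
    """Return the byte, or halt with an error if the parity does not match."""
    if parity_bit(byte) != saved_parity:
        raise ParityError("memory parity error, halting")
    return byte


value, parity = store(0b10110010)
try:
    load(value ^ 0b00000100, parity)  # one bit flipped while in memory
except ParityError as error:
    print(error)
```

Parity only detects the error; it cannot say which bit flipped, so the machine can stop but not repair. An even number of flipped bits in the same byte also slips through, which is one reason ECC is the stronger option.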

Best wishes,
 
J. Clarke

Lil' Dave said:
While I agree with another post that physical memory (RAM), as a
component of I/O, can be a problem, there is nothing mystical about it,
i.e. cosmic rays. Nor do I agree that ECC should be used for an at-home
desktop. Stick with quality memory, from Crucial for instance. Don't mix
and match memory either.

The incremental cost of ECC is minuscule and the performance penalty equally
so. The only reason _not_ to use it is the paucity of boards that support
it (three Bronx cheers for Intel).
 
David Maynard

Michael said:
What's "mystical" about cosmic rays? They do reach earth.

I think he means that the odds of the stated symptom being caused by cosmic
rays are much lower than for more conventional sources. So much lower that
it falls into the 'mystical' category.

Michael said:
Cosmic rays
and local radioactive decay can and do cause computer memory errors
(IBM Journal of Research and Development, Volume 40, Number 1).

A test made by IBM on a 4Mbit DRAM found a soft error rate of about 6000
in a billion chip hours. A similar test in a vault under 20 tons of rock
produced no errors.

That rate comes to 1 per 19 years, assuming 24/7 operation. Hey, if you're
lucky it was off when that one came through ;) Or you were doing any of the
90% of the time non critical things people normally use a home PC for.
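
As a rough check on that figure, the arithmetic can be written out. This is a minimal sketch that takes the quoted IBM number at face value, for a single 4Mbit chip running continuously.

```python
ERRORS = 6000                    # soft errors observed, per the quoted IBM test
CHIP_HOURS = 1e9                 # "a billion chip hours"
HOURS_PER_YEAR = 24 * 365.25     # 24/7 operation

hours_per_error = CHIP_HOURS / ERRORS
years_per_error = hours_per_error / HOURS_PER_YEAR

print(round(hours_per_error), round(years_per_error, 1))
```

That works out to roughly 166,667 hours, or about 19 years, between errors for one such chip.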

Michael said:
Is there any reason not to use ECC besides some cost and a very small
loss of performance?

Two good reasons.

Michael said:
I suppose this comes down to what a "home computer" is. Some may be used
to play games and write letters; others may archive a lifetime's worth
of work.

It comes down to more than that: the odds of it happening and the
consequences if it does (which is an entire probability set of its own) vs
the cost of taking preventative measures.

Once in 19 years is a rather rare event and even if it happened that
doesn't mean you automatically lose 'important' data. It would have to
occur at a particular time that affected a particular thing in a particular
manner. Ok, so maybe I lost a 'pixel' in a picture of pooch or it blew a
character in one of those wonderful SPAM emails that come with garbled text
to begin with. Odds are the real impact [pun intended] would be 'erp',
unexplained program error, a few curse words about 'microsoft software',
and restart [as if THAT never happens even without the help of cosmic rays].

For the typical home user, the odds of losing EVERYTHING from a hard drive
failure, combined with the traditionally lousy backup regimen, or from some
other failure that causes the system to go 'nuts', are much, much higher
than the odds of trouble from cosmic rays. The odds are higher that it'll
get bumped at an inopportune time, or that a component will fail, or that a
connector will work loose from thermal creep, or any number of things. Hell,
the odds of the user screwing his data up HIMSELF are a thousand times higher.

And we didn't even touch on getting a virus.

Michael said:
If not ECC memory, there is an advantage to using parity-checked memory; a
memory error should cause the computer to halt with a warning, rather
than corrupting files.

I agree, if one is using it to calculate warp drive trajectories and an
'oops' may put you inside a sun somewhere. But then I'd be recommending
multiple redundant systems too.
 
Michael Salem

David said:
I think he means that the odds of the stated symptom being caused by cosmic
rays are much lower than for more conventional sources. So much lower that
it falls into the 'mystical' category.

As I understand it, from checking references, 98% of memory errors
(presumably in tested, good-quality, memory, running in quality systems
with clean power) are soft errors; virtually all soft errors are due to
either cosmic radiation or background radioactivity (e.g., atoms of
thorium from our weakly polluted environment embedded in the casing of
RAM chips).

I believe that power supply problems can also cause RAM corruption,
which ECC would protect against.

David said:
That rate comes to 1 per 19 years, assuming 24/7 operation. Hey, if you're
lucky it was off when that one came through ;) Or you were doing any of the
90% of the time non critical things people normally use a home PC for.

I expect this depends very strongly upon the particular chips, and the
environment. However, if I unjustifiably extrapolate from the figures I
quote, we get roughly one error per 10^9/6000 hours for a 4 Mbit chip, or
about 2000 times that error rate for 1 GB of such chips.
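
The extrapolation can be written out as a short calculation. It is a sketch under two loud assumptions: that the error rate scales linearly with the number of bits, and that the chip in the IBM figure is 4 Mbit (megabits), as quoted earlier in the thread. If the chips were 4 MB (megabytes) the factor would be 256 rather than 2048.

```python
ERRORS_PER_CHIP_HOUR = 6000 / 1e9   # quoted IBM figure, per 4 Mbit chip
CHIP_BITS = 4 * 2**20               # 4 Mbit per chip (assumption, see above)
MEMORY_BITS = 8 * 2**30             # 1 GB expressed in bits

chips = MEMORY_BITS // CHIP_BITS
errors_per_hour = chips * ERRORS_PER_CHIP_HOUR
days_between_errors = 1 / errors_per_hour / 24

print(chips, round(days_between_errors, 1))
```

On those assumptions 1 GB corresponds to 2048 chips and an error every few days. The chips tested were an old generation, so this says more about how the numbers scale than about any current machine.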

Ultimately we must balance the estimated probability of soft errors,
multiplied by their consequences, against the cost and speed loss.

Personally I have things I very much hate to lose on my hard disc, so
recently chose to buy a motherboard and RAM supporting ECC for maybe an
easily affordable extra 50 GBP (to be multiplied by future similar
purchases). I didn't do this when ECC was very expensive, though I did
prefer parity RAM if possible.
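
That balance can be put into numbers, at least roughly. In the sketch below only the 50 GBP premium and the 1-in-19-years rate come from this thread; the machine lifetime, the fraction of errors that do real harm, and the cost of a harmful error are hypothetical placeholders to be replaced with your own estimates.

```python
def expected_loss(errors_per_year, years, harmful_fraction, cost_per_harmful_error):
    """Expected cost of soft errors over the life of the machine."""
    return errors_per_year * years * harmful_fraction * cost_per_harmful_error


ECC_PREMIUM_GBP = 50.0              # extra cost of ECC board and RAM (from the thread)

loss = expected_loss(
    errors_per_year=1 / 19,         # single-chip rate discussed in the thread
    years=5,                        # hypothetical machine lifetime
    harmful_fraction=0.1,           # hypothetical: 1 error in 10 hits data that matters
    cost_per_harmful_error=2000.0,  # hypothetical cost in GBP of losing that data
)
worth_it = loss > ECC_PREMIUM_GBP

print(round(loss, 2), worth_it)
```

With these placeholder inputs the expected loss comes out only slightly above the premium, so small changes to any estimate flip the answer.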

As you say, a bit error is likely to be trivial (I did once have a
system with a hard error which invariably corrupted a single letter
every time I edited and saved a document!) But it can conceivably cause
big trouble.

I don't want to argue for its own sake, and I don't think there is a
"correct" decision. I am quite happy to be called paranoid (=, in some
cases, survivor). I ask for further reasons not to use ECC in case I
need to re-evaluate my policy.

Most people dealing with file servers use ECC, presumably from the same
possibly excessive caution as me.

I consider some of the data on my own hard disc as more valuable than
what is on many servers (though it is rarely rewritten, and more
susceptible to hard disc than RAM corruption).

Best wishes,
 
CBFalconer

Alex said:
You seem to be missing the point ;). I think the point is that
ECC is not necessary for desktop systems if you use good quality
non-ECC memory, such as Crucial.

The fact that the majority of chipsets on motherboards used for
desktop systems can't take advantage of ECC makes it largely a
moot point anyway.

If people simply refuse to buy systems without ECC memory that
silliness will disappear. The point is that non-ECC memory
systems are completely vulnerable to such things as Cosmic Rays,
without any immediate warnings, and that a complete cure is
available for very moderate cost. The cost is probably negative
if you include the diagnosis time for other faults.
 
Bob Day

CBFalconer said:
If people simply refuse to buy systems without ECC memory that
silliness will disappear. The point is that non-ECC memory
systems are completely vulnerable to such things as Cosmic Rays,

Yes. A white paper I've recently found that was published last
January indicates that a PC with 512MB of memory running
24 hours a day will sustain a memory error on an average of
about every 10 days. See:
http://www.tezzaron.com/about/papers/Soft Errors 1_1 secure.pdf ,
Appendix B, Calculations, on page 6.
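
To compare that figure with the IBM number quoted earlier in the thread, both can be converted to errors per 10^9 hours per Mbit (the unit usually called FIT per Mbit). This is a sketch using only the numbers given in this thread; it assumes the 512MB is all in use and that errors scale linearly with memory size.

```python
def fit_per_mbit(errors, hours, megabits):
    """Errors per 10^9 hours of operation per Mbit of memory."""
    return errors / (hours * megabits) * 1e9


# White paper figure as quoted here: one error per 10 days in 512MB.
white_paper = fit_per_mbit(errors=1, hours=10 * 24, megabits=512 * 8)

# IBM figure as quoted earlier: 6000 errors per 10^9 hours on a 4Mbit chip.
ibm = fit_per_mbit(errors=6000, hours=1e9, megabits=4)

print(round(white_paper), round(ibm))
```

The two come out at roughly 1000 and 1500 per Mbit, so on these assumptions the figures are the same order of magnitude.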

-- Bob Day
http://bobday.vze.com
 
David Maynard

Michael said:
As I understand it, from checking references, 98% of memory errors
(presumably in tested, good-quality, memory, running in quality systems
with clean power) are soft errors;

Well, they would be. A soft error simply means that an error occurred but
the memory has not failed; otherwise it would be a hard error.

Michael said:
virtually all soft errors are due to
either cosmic radiation or background radioactivity (e.g., atoms of
thorium from our weakly polluted environment embedded in the casing of
RAM chips).

I believe that power supply problems can also cause RAM corruption,
which ECC would protect against.

I'm not sure. PSU induced errors may be too large for ECC to fix.

I expect this depends very strongly upon the particular chips, and the
environment. However, if I unjustifiably extrapolate from the figures I
quote, we get roughly one error per 10^9/6000 hours for a 4 Mbit chip, or
about 2000 times that error rate for 1 GB of such chips.

Yes. Soft error rates for one type of RAM don't necessarily extrapolate
directly, either per chip-hour-MB or to different chip technologies and
densities.

As a case in point, these folks say they've licked it (almost) entirely,
for SRAMs anyway, using, amusingly enough, DRAM technology.

http://neasia.nikkeibp.com/wcs/leaf/CID/onair/asabt/news/315057

Since you were asking about it I did a quick google for some more recent
numbers on DRAM and found this rather interesting article, which includes
some analysis on the things I mentioned, like 'what is it doing when' and
whether it's fatal, etc.

http://www.eecg.toronto.edu/~lie/papers/hp-softerrors-ieeetocs.pdf

It's interesting to note that early on they mention an error rate for
'modern' 64MB RAM chips in a 1 GB memory. While the "300
reboots resulting from soft errors on 10000 machines in 1 year" sounds like
a large number, because they're looking at how 'the industry' is affected,
it translates to 1 per 33 years for a 'user' on his one machine.
(Unfortunately they don't make it clear whether those are errors that get
THROUGH the ECC or whether they're using the ECC to count the errors [my
assumption].)
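
The conversion from the fleet figure to a single machine can be checked in a few lines. The five-year probability at the end is an added illustration that assumes errors arrive independently at a constant rate (a Poisson model), which the article may or may not assume.

```python
import math

MACHINES = 10000
REBOOTS_PER_YEAR = 300                                 # across the whole fleet

years_per_reboot = MACHINES / REBOOTS_PER_YEAR         # for one machine
rate_per_machine_year = REBOOTS_PER_YEAR / MACHINES    # 0.03

# Chance that one machine sees at least one such reboot in 5 years,
# under the Poisson assumption stated above.
p_five_years = 1 - math.exp(-rate_per_machine_year * 5)

print(round(years_per_reboot, 1), round(p_five_years, 3))
```

That gives about 33 years between events for one machine, or roughly a 14% chance of at least one over five years.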

Btw, CPUs have the same susceptibility.


Michael said:
Ultimately we must balance the estimated probability of soft errors,
multiplied by their consequences, against the cost and speed loss.

Personally I have things I very much hate to lose on my hard disc, so
recently chose to buy a motherboard and RAM supporting ECC for maybe an
easily affordable extra 50 GBP (to be multiplied by future similar
purchases). I didn't do this when ECC was very expensive, though I did
prefer parity RAM if possible.

As you say, a bit error is likely to be trivial (I did once have a
system with a hard error which invariably corrupted a single letter
every time I edited and saved a document!) But it can conceivably cause
big trouble.

I don't want to argue for its own sake, and I don't think there is a
"correct" decision. I am quite happy to be called paranoid (=, in some
cases, survivor). I ask for further reasons not to use ECC in case I
need to re-evaluate my policy.

I don't have a reason 'not to' other than the ones already mentioned. I
just thought the blanket assertion of 'cosmic rays, oh my' was overstated,
especially when one considers all the other things that are more probable
and likely to cause even worse problems.

Most people dealing with file servers use ECC, presumably from the same
possibly excessive caution as me.

Well, they have a higher risk exposure.
 
