recommended drives?

Rod Speed

J. Clarke said:
kinetic wrote
Modern disks have spare sectors and map bad ones to spares
automatically. This means that you never see a bad sector until the
disk is in really bad shape, so it seems to fail without warning.
SMART is intended to deal with this by providing you a warning when
the disk is starting to show signs of impending failure--it is built
into the disk and monitors a bunch of indicators, including the
percentage of the spare sectors that have been used--when it tells
you that a disk is on the way out, it's not a guarantee of impending
failure--the disk may run in its current condition for a very long
time--but you were warned.
That message is actually from your computer's firmware--SMART doesn't
say "this drive is about to go", it just reports a bunch of numbers
that if you have turned on the capability get checked during the POST
routine and if the combination is such that your computer
manufacturer's programmers have decided that they indicate that the
disk is having problems the machine will give you the error message
you see. If it came up once and not again it's probably a
fluke--something went wrong during power up that one time.

That shouldn't happen with properly implemented SMART.
It should be able to work out that it is a one-off and not
bring up a warning that the drive is about to die.
 
Folkert Rienstra

Oh, here we go again. That stupid troll keeps doing it.

J. Clarke said:
Modern disks have spare sectors and map bad ones to spares automatically.

No they don't. They do it automatically on sectors *that are about to fail*
but don't yet fail completely.
This means that you never see a bad sector until the disk is in really bad shape,

Wrong again.
Bad sectors with uncorrectable read errors are *never* replaced automatically.
It takes a write to such a sector.
The Clarke troll knows it because he has experienced that himself and is even
on record for it in Google.
so it seems to fail without warning.

SMART is intended to deal with this by providing you a warning when the disk
is starting to show signs of impending failure--it is built into the disk
and monitors a bunch of indicators, including the percentage of the spare
sectors that have been used--when it tells you that a disk is on the way
out, it's not a guarantee of impending failure--the disk may run in its
current condition for a very long time--but you were warned.

That message is actually from your computer's firmware--

or any SMART monitoring utility.
SMART doesn't say "this drive is about to go",

Wrong again. It *does* indicate that by providing a status:

The SMART ATA drive specification allows up to 30
internal drive measurements. These are termed failure
attributes and are periodically measured by a drive.
Attribute values are stored in the drive reserved data area
with other drive operational parameters. For a drive user to
receive a SMART warning the computer system must issue
specific drive interface commands to enable the algorithm
and then to read the resultant *“won’t-fail/will-fail”* warning.

Maximum thresholds are defined for each attribute by the
drive manufacturer.

The *SMART warning flag* is set in response to an
ATA SMART *“Return Status”* command,
if any attribute exceeds its threshold. This is a logical ‘OR’
operation among the several attribute threshold tests, and is
used because some drive failures may be predicted by only
one attribute.

*Threshold Exceeded Condition*
If one or more attribute values, whose Pre-failure bit of their
status flag is set, are less than or equal to their corresponding
attribute thresholds, then the device reliability status is *negative*,
indicating an *impending degrading or faulty condition*.
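
The Threshold Exceeded Condition quoted above can be sketched in a few lines. This is only an illustration of the rule as worded in the excerpt, not drive firmware; the `Attribute` class, its field names, and the sample numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    attr_id: int
    value: int        # normalized current value reported by the drive
    threshold: int    # threshold defined by the drive manufacturer
    prefailure: bool  # Pre-failure bit of the attribute's status flag


def threshold_exceeded(attrs):
    """Logical OR over the per-attribute tests: one failing pre-failure
    attribute is enough to make the reliability status negative."""
    return any(a.prefailure and a.value <= a.threshold for a in attrs)


# Hypothetical sample data, for illustration only.
attrs = [
    Attribute(1, 54, 60, True),     # at or below threshold: trips the condition
    Attribute(5, 100, 36, True),    # comfortably above threshold
    Attribute(194, 30, 40, False),  # not a pre-failure attribute: ignored
]

print(threshold_exceeded(attrs))
```

Attributes without the Pre-failure bit do not contribute to the result, which matches the wording "whose Pre-failure bit of their status flag is set".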

*SMART Return Status*
If the device does not detect a Threshold Exceeded Condition,
the device loads 4Fh into the Cylinder Low register and C2h
into the Cylinder High register.
If the device detects a Threshold Exceeded Condition, the
device loads F4h into the Cylinder Low register and 2Ch
into the Cylinder High register.
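
A monitoring tool only has to look at those two register values. A minimal sketch of that decoding, assuming the register contents have already been read back by some other means (the function name and the returned strings are made up for the example):

```python
def decode_smart_return_status(cyl_low: int, cyl_high: int) -> str:
    """Map the Cylinder Low/High register pair from the excerpt above
    to a readable status string."""
    if (cyl_low, cyl_high) == (0x4F, 0xC2):
        return "ok"                  # no Threshold Exceeded Condition
    if (cyl_low, cyl_high) == (0xF4, 0x2C):
        return "threshold exceeded"  # drive reports negative reliability status
    return "unknown"                 # anything else is not covered by the excerpt


print(decode_smart_return_status(0x4F, 0xC2))
print(decode_smart_return_status(0xF4, 0x2C))
```

How the registers are actually obtained depends on the operating system and driver, so that part is deliberately left out.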
it just reports a bunch of numbers that if you have turned on the
capability get checked during the POST routine and if the combination
is such that your computer manufacturer's programmers have decided
that they indicate that the disk is having problems the machine will give
you the error message you see.

Wrong again.
It just translates the SMART Return Status into a user understandable message.
If it came up once and not again it's probably a fluke--
Nonsense.

something went wrong during power up that one time.

More likely an attribute counter that had crossed the threshold at that time,
only to return to safe territory later. If counters only counted backwards
there obviously wouldn't be a reason for "current" and "worst" value counters.
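
To illustrate the "current" versus "worst" point, here is a small sketch with a made-up series of normalized readings and a made-up threshold of 60. It only models the bookkeeping; how a real drive updates its attributes is vendor-specific.

```python
def track(values, threshold):
    """For each reading, record (current, worst so far, at-or-below-threshold)."""
    worst = None
    history = []
    for v in values:
        worst = v if worst is None else min(worst, v)
        history.append((v, worst, v <= threshold))
    return history


# Hypothetical readings: the value dips below the threshold once, then recovers.
for current, worst, failing in track([70, 62, 58, 65], threshold=60):
    print(current, worst, failing)
```

In the last reading the current value is back above the threshold while the worst value still records the dip, which is how a warning can appear once and not again.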
If it's happening most times that you power up then I'd replace the drive
at my earliest convenience. If you haven't shut the machine down and
rebooted since you saw it you should and see if it comes up again

Or use a S.M.A.R.T. monitoring app and do a 'Read Status'.
 
Arno Wagner

Previously J. Clarke said:
kinetic wrote:
Modern disks have spare sectors and map bad ones to spares
automatically. This means that you never see a bad sector until the
disk is in really bad shape, so it seems to fail without warning.
SMART is intended to deal with this by providing you a warning when
the disk is starting to show signs of impending failure--it is built
into the disk and monitors a bunch of indicators, including the
percentage of the spare sectors that have been used--when it tells
you that a disk is on the way out, it's not a guarantee of impending
failure--the disk may run in its current condition for a very long
time--but you were warned.
That message is actually from your computer's firmware--SMART
doesn't say "this drive is about to go",

Actually it does. If one of the pre-fail attributes falls below
the threshold (thresholds are given by the disk), then this means
the drive is about to go and the drive will report a bad SMART status.
it just reports a bunch of numbers that if you have turned on the
capability get checked during the POST routine and if the
combination is such that your computer manufacturer's programmers
have decided that they indicate that the disk is having problems the
machine will give you the error message you see.

It should be the disk itself that claimed it had bad SMART status,
not the BIOS deciding this. I have seen disks that reported bad smart
status and had pre-fail attributes below their thresholds but
"recovered". Most notably a Maxtor drive did this to me. I mounted
it in an always-on machine and tortured it (with kernel compiles) until
it had a stable bad SMART status.
If it came up once and not again it's probably a fluke--something
went wrong during power up that one time. If it's happening most
times that you power up then I'd replace the drive at my earliest
convenience. If you haven't shut the machine down and rebooted
since you saw it you should and see if it comes up again--if it
does, assume that the disk is about to go.

The SMART thresholds on the disks I have used so far were all
optimistic, sometimes very optimistic. I don't think that a
few reallocated sectors are a reason to panic, could have been
a power problem, e.g., but a bad SMART status is a red light
that should not be ignored.

Arno
 
kinetic

Rod said:
Usually the warning is accurate. Post the SMART data
report using Everest; it's usually obvious from that more
detailed report what the problem with the drive is.
http://www.lavalys.com/products/overview.php?pid=1&lang=en


Normally not a good idea. If the SMART data
indicates that the drive is dying, it usually will die.

This is what I got from Everest. I guess she's going to the dump, huh?

~ Jamie West

ID  Attribute Description  Threshold  Value  Worst  Data       Status
01  Raw Read Error Rate    60         54     54     490736906  Pre-Failure: Imminent loss of data is being predicted
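
For anyone reading a report like this one, the check behind the status column can be sketched as follows. The column order (ID, description, threshold, value, worst, data, status) is taken from the header line above; the regular expression and the function name are assumptions for the example and not part of Everest.

```python
import re

# Columns: ID, description, threshold, value, worst, raw data, status text.
ROW = re.compile(
    r"^(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<threshold>\d+)\s+(?P<value>\d+)"
    r"\s+(?P<worst>\d+)\s+(?P<data>\d+)\s+(?P<status>.*)$"
)


def parse_row(line):
    """Split one report row into named fields, converting numbers to int."""
    m = ROW.match(line)
    if m is None:
        raise ValueError("unrecognized row: %r" % line)
    row = m.groupdict()
    for key in ("threshold", "value", "worst", "data"):
        row[key] = int(row[key])
    return row


line = ("01 Raw Read Error Rate 60 54 54 490736906 "
        "Pre-Failure: Imminent loss of data is being predicted")
row = parse_row(line)

# The normalized value (54) is at or below the threshold (60).
print(row["name"], row["value"], row["threshold"],
      row["value"] <= row["threshold"])
```

Note that the comparison uses the normalized Value column, not the large raw Data number.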
 
J. Clarke

Arno said:
Actually it does. If one of the pre-fail attributes falls below
the threshold (thresholds are given by the disk), then this means
the drive is about to go and the drive will report a bad SMART status.

??? When did the drives start reporting "bad smart status"? Last I heard
they reported raw numbers which were interpreted manually or by software
external to the drive. If that software reports a problem once and then on
retesting does not report the same problem, that would indicate a transient
of some kind, and not a problem inherent to the drive.

To take an example, I had five drives that according to the manufacturer's
diagnostics were all terminal. I also had a bad power supply that was
intermittently dropping the voltages out of spec. Replaced the power
supply, reran the tests, drives reported within acceptable limits and three
years later they're all fine.
It should be the disk itself that claimed it had bad SMART status,
not the BIOS deciding this. I have seen disks that reported bad smart
status and had pre-fail attributes below their thresholds but
"recovered". Most notably a Maxtor drive did this to me. I mounted
it in an always-on machine and tortured it (with kernel compiles) until
it had a stable bad SMART status.

What specific field of the SMART data contains this "bad status"?
The SMART thresholds on the disks I have used so far were all
optimistic, sometimes very optimistic. I don't think that a
few reallocated sectors are a reason to panic, could have been
a power problem, e.g., but a bad SMART status is a red light
that should not be ignored.

If your first reaction on finding a system that reports a SMART issue is to
replace the drive then you might replace ten drives in that system before
whatever is causing it fails and you fix the real problem.
 
J. Clarke

kinetic said:
This is what I got from Everest. I guess she's going to the dump, huh?

~ Jamie West

ID  Attribute Description  Threshold  Value  Worst  Data       Status
01  Raw Read Error Rate    60         54     54     490736906  Pre-Failure: Imminent loss of data is being predicted

The question is why it is happening. Is the drive itself bad or is
something external to the drive preventing it from operating properly?
 
Arno Wagner

Previously J. Clarke said:
Arno Wagner wrote:
??? When did the drives start reporting "bad smart status"? Last I heard
they reported raw numbers which were interpreted manually or by software
external to the drive. If that software reports a problem once and then on
retesting does not report the same problem, that would indicate a transient
of some kind, and not a problem inherent to the drive.
To take an example, I had five drives that according to the manufacturer's
diagnostics were all terminal. I also had a bad power supply that was
intermittently dropping the voltages out of spec. Replaced the power
supply, reran the tests, drives reported within acceptable limits and three
years later they're all fine.

O.k. a massive environmental problem can cause this. Basically only
the PSU can.
What specific field of the SMART data contains this "bad status"?

I am not sure whether it is an explicit field or merely one of the
pre-fail attributes being below the threshold. Since the thresholds
are exported by the drive (!), this is equivalent. The drive gives
you also its "worst ever" value for each attribute. At least
if logging is enabled on the drive.
If your first reaction on finding a system that reports a SMART
issue is to replace the drive then you might replace ten drives in
that system before whatever is causing it fails and you fix the real
problem.

Not a SMART issue. A failed status. An issue is for example a
pre-fail attribute that has dropped 10 points. That is not a reason
to replace the drive (not necessarily, that is). But an attribute
reaching its threshold is in my experience indication of a massive
problem.

I admit your PSU example is convincing, but unless you diagnose
something this severe in the disk environment, you should take
SMART failures seriously.

Arno
 
Folkert Rienstra

Arno Wagner said:
[snip]
??? When did the drives start reporting "bad smart status"?

T13 offline? Your Acrobat Reader broken?
*Last I heard*

Such imagination.

But still tells people to ditch their drives on first sight of a bad sector showing.
O.k. a massive environmental problem can cause this.
Basically only the PSU can.

Clueless, as always.
Overtemperature can do this too. Bad firmware can cause it too.

Are you another Ostrich? Related to the Wagner family of Ostriches?
I am not sure whether it is an explicit field or merely one of the
pre-fail attributes being below the threshold. Since the thresholds
are exported by the drive (!), this is equivalent. The drive gives
you also its "worst ever" value for each attribute.
At least if logging is enabled on the drive.

Care to give a reference for that? Because this is almost saying
that SMART values are only preserved if logging is enabled.
If logging were not enabled then SMART values would only be valid and counting
from power-up and would be pretty useless. Surely that cannot be the case.

Odd.
Earlier this same troll said to immediately exchange a drive at the first
sight of a bad sector showing as this would be the sure indication of the
drive having exhausted its spare sectors. Now it says the opposite.
Not a SMART issue.
A failed status.

Which is set by S.M.A.R.T.
An issue is for example a pre-fail attribute that has dropped 10 points.
Nonsense.

That is not a reason to replace the drive (not necessarily, that is).

Exactly.
It is a reason to check your system for possible onset of problems.
But an attribute reaching its threshold is in my experience indication
of a massive problem.

Depends on the attribute. Not all are pre-failure.
But it is an indication that problems have been going on for a long time unnoticed.
If not then you must have noticed problems already without S.M.A.R.T. having to tell you.
I admit your PSU example is convincing, but unless you diagnose
something this severe in the disk environment, you should take
SMART failures seriously.

Of course. It is the first best sign that something is amiss.
But it isn't necessarily the drive that is at fault, but just the
object of suffering.
 
Folkert Rienstra

J. Clarke said:
The question is why it is happening. Is the drive itself bad or is
something external to the drive preventing it from operating properly?

Says the troll that is telling everyone to ditch their drives on the first sight of a bad sector showing.
 
Folkert Rienstra

kinetic said:
This is what I got from Everest. I guess she's going to the dump, huh?

Say you have 10 bad sectors that were written once during a power dip.
Say you read those bad sectors a lot. Each time the system reads the bad
sector the drive will do that several times because it will retry several
times. For good measure the system will also retry a few times, where each
time will be retried again by the drive several times. So for each read you
get retries-times-retries of reads to the bad sector for which each and
every time the raw read error counter increases. Give it enough time and
your error counter reaches threshold. Does that mean your drive is dead?
No, it means that you have 10 bad blocks that are not bad at all but that
were not taken care of, so kept being a nuisance and subsequently logged.
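
The retries-times-retries arithmetic can be made concrete with a rough sketch. Only the 10 bad sectors come from the example above; the attempt counts and the number of reads are invented for illustration, since real drives and operating systems differ.

```python
def logged_read_errors(bad_sectors, reads_per_sector,
                       drive_attempts, system_attempts):
    """Each application read is attempted system_attempts times, and each of
    those is attempted drive_attempts times inside the drive; every attempt
    on a bad sector counts as one logged read error."""
    return bad_sectors * reads_per_sector * drive_attempts * system_attempts


# Hypothetical figures: 10 bad sectors, each read 100 times by applications,
# 8 attempts inside the drive, 3 attempts by the operating system.
print(logged_read_errors(10, 100, 8, 3))
```

With these made-up figures a handful of never-rewritten sectors already accounts for tens of thousands of logged errors.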

Check your Power Supply and PS cabling, drive temperature.
Wipe the drive (overwrite every sector).
Note down your SMART attributes.
Run a drive exerciser like Bart's DiskTool.
Check your SMART attributes again.

If nothing changed, your drive is good.
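
The before/after comparison in the steps above can be done by hand, or with a small script along these lines. It is only a sketch: the snapshots are typed in manually as dictionaries of name to (value, worst, raw data), and the "Reallocated Sector Count" numbers are hypothetical.

```python
def changed_attributes(before, after):
    """Return {name: (before, after)} for attributes that are present in
    both snapshots and whose recorded tuple differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Snapshot noted down before running the drive exerciser.
before = {
    "Raw Read Error Rate": (54, 54, 490736906),
    "Reallocated Sector Count": (100, 100, 0),  # hypothetical values
}

# Two hypothetical outcomes after the exerciser run.
after_unchanged = dict(before)
after_worse = dict(before, **{"Raw Read Error Rate": (52, 52, 490740000)})

print(changed_attributes(before, after_unchanged))
print(changed_attributes(before, after_worse))
```

An empty result corresponds to the "nothing changed" case; anything else shows which attributes moved during the test.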
 
