bad sectors on a mybook WD usb drive

sobriquet

:
:In fact the Everest SMART report shows that it actually got to 87C and that is utterly obscene.

I don't get that from the report.

    ID  Attribute Description  Threshold  Value  Worst  Data
    --  ---------------------  ---------  -----  -----  ----
    C2  Temperature                    0     89     87    63

That "87" is the _normalized_ parameter -- a value that drops with
increasing temperature and indicates "fail" status when it falls to the
threshold value of 0.  Note that the current value is "89" while the
value in the "Worst" column is "87".  That makes no sense if those
values are actual temperatures.  No, it appears that the drive was
never much hotter than at the time that measurement was taken.

It would be really interesting to see what those numbers are when the
drive is first switched on after an extended power off cooldown, when
the drive is still near the ambient temperature.


Here is a screenshot of the SMART info from Everest when the drive has
just been turned on after the power has been off for a while.


http://img27.imageshack.us/img27/3280/everest2t.jpg
 
Arno

Franc Zabkar said:
On 19 Feb 2010 14:37:00 GMT, Arno <[email protected]> put finger to
keyboard and composed:
I was referring to the normalised attribute value.
Wikipedia is unclear, but it does mention something along those lines
for attribute BE:
http://en.wikipedia.org/wiki/S.M.A.R.T.

Well, some more datapoints from 4 of my WD disks:

Cooked  Raw (C)
   106       41
   115       32
   110       40
   128       22


The linear regression tool at http://www.xuru.org/rt/LR.asp
gives me

y = -0.907188353 x + 137.8498635

and an error of up to 1.95C. That would give 47.1C for a cooked
value of 100. There seems to be a rather large rounding error or
the like in here.

Anyway, the OP's 89 would be 57C from my datapoints, or
59C if I add 100/50C.
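
For anyone who wants to reproduce the fit without the online tool, here is a minimal Python sketch. It assumes the tool does an ordinary least-squares fit of temperature against the cooked value; the helper name `fit_line` is made up for illustration.

```python
# Least-squares fit of temperature (C) against the normalized ("cooked") value.
# Data are the four WD disks listed above: (cooked value, raw temperature in C).
points = [(106, 41), (115, 32), (110, 40), (128, 22)]


def fit_line(pts):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(points)
print(f"y = {slope:.9f} x + {intercept:.7f}")

# Largest deviation of a measured temperature from the fitted line.
max_err = max(abs(y - (slope * x + intercept)) for x, y in points)
print(f"max error: {max_err:.2f} C")

# Predictions for a cooked value of 100 and for the OP's current value of 89.
for cooked in (100, 89):
    print(cooked, "->", round(slope * cooked + intercept, 1), "C")

# Same fit with the theoretical point 100 -> 50 C added.
slope2, intercept2 = fit_line(points + [(100, 50)])
print(89, "->", round(slope2 * 89 + intercept2, 1), "C (with 100/50C added)")
```

With only four or five points, a single extra datapoint shifts the line noticeably, so the result is a rough estimate at best.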

Interestingly, if I use only the lowest datapoint
(128/22C) and the theoretical 100/50C, I get

y = -1 x + 150

I suspect there is some damping or averaging or the
like going on with the cooked value, and I also
suspect temp = -1 x cooked + 150 [C] is what we
want.

With that, the OP's 89 and worst 87 become
61C and 63C, which is definitely far too high for
comfort. The 63C is high enough that it could
have caused enough (temporary) degradation for
the 6000 reallocated sectors.
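
The two-point line and the resulting conversion can be checked with a few lines of Python. This is only a sketch of the suspected mapping, not a documented WD formula; the function names `line_through` and `cooked_to_celsius` are invented here.

```python
# Line through two (cooked value, temperature in C) points:
# the lowest measured datapoint and the theoretical 100 -> 50 C.
def line_through(p, q):
    """Return (slope, intercept) of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


slope, intercept = line_through((128, 22), (100, 50))
print(f"y = {slope:g} x + {intercept:g}")


def cooked_to_celsius(cooked):
    """Suspected mapping from cooked value to temperature (an assumption, see above)."""
    return slope * cooked + intercept


# The OP's current and worst normalized values.
for label, cooked in (("current", 89), ("worst", 87)):
    print(label, cooked, "->", cooked_to_celsius(cooked), "C")
```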

In fact, borderline failure due to significant overheating
seems to be the most likely cause to me now.


Arno
 
Arno

sobriquet said:
Here is a screenshot of the SMART info from Everest when the drive has
just been turned on after the power has been off for a while.

http://img27.imageshack.us/img27/3280/everest2t.jpg

Fits. With the linear regression from my other posting,
it looks like your disk went up to something like 63C,
and that could be enough to degrade its mechanics and
electronics enough to have caused a large number of
errors.

To sum up: It looks like you nearly cooked your disk to
death and the 6000 reallocated sectors happened when it
was close to failing completely.

Note that there are 3 stages to heat death (the temperature ranges
are my personal estimates and also depend on the drive):

1. Starts to produce errors [60-70C]: you were there
2. Fails, but works again after cooldown [65-75C]
3. Fails permanently or suffers permanent damage [?]

In all stages the disk ages very rapidly and may fail soon.
I would also not really trust a disk anymore that has reached
stage 2.
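
As a rough illustration, the estimated ranges above can be written as a small lookup. The ranges are only the personal estimates from this post, they overlap on purpose, and the name `possible_stages` is made up for this sketch.

```python
# Rough stage estimates from the list above, as (stage, low C, high C).
# Stage 3 is left out because no temperature range was given for it.
STAGES = [
    (1, 60, 70),  # starts to produce errors
    (2, 65, 75),  # fails, but works again after cooldown
]


def possible_stages(temp_c):
    """Return the stage numbers whose estimated range contains temp_c."""
    return [stage for stage, low, high in STAGES if low <= temp_c <= high]


print(possible_stages(63))  # the estimated worst temperature of the OP's disk
print(possible_stages(68))  # a temperature inside both estimated ranges
```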

Arno
 
Rod Speed

sobriquet said:
But that 40 number for the hitachi drive is in the same column as the 63 for the WD drive..

Yes, but there are two numbers in that column with the Hitachi.
and I don't understand the relationship between
the raw values and the value/worst numbers, or does
that differ between various brands/models of HDs?

Yes it does.
 
Robert Nichols

:
:> Here is a screenshot of the SMART info from Everest when the drive has
:> just been turned on after the power has been off for a while.
:
:
:> http://img27.imageshack.us/img27/3280/everest2t.jpg
:
:Fits. With the linear regression from my other posting,
:it looks like your disk went up to something like 63C,
:and that could be enough to degrade its mechanics and
:electronics enough to have caused a large number of
:errors.

Plus, it confirms that the raw value (28) is indeed the Celsius
temperature.

:To sum up: It looks like you nearly cooked your disk to
:death and the 6000 reallocated sectors happened when it
:was close to failing completely.
:
:Note that there are 3 stages to heat death (the temperature ranges
:are my personal estimates and also depend on the drive):
:
: 1. Starts to produce errors [60-70C]: you were there
: 2. Fails, but works again after cooldown [65-75C]
: 3. Fails permanently or suffers permanent damage [?]
:
:In all stages the disk ages very rapidly and may fail soon.
:I would also not really trust a disk anymore that has reached
:stage 2.

And, unless that drive has been operating in an unusually hot
environment (sitting on top of a hot air register, maybe?) I'd scream
bloody Hell to WD about the unconscionably bad thermal design of that
enclosure.
 
Arno

Robert Nichols said:
:
:> Here is a screenshot of the SMART info from Everest when the drive has
:> just been turned on after the power has been off for a while.
:
:
:> http://img27.imageshack.us/img27/3280/everest2t.jpg
:
:Fits. With the linear regression from my other posting,
:it looks like your disk went up to something like 63C,
:and that could be enough to degrade its mechanics and
:electronics enough to have caused a large number of
:errors.
Plus, it confirms that the raw value (28) is indeed the Celsius
temperature.

Indeed.

:To sum up: It looks like you nearly cooked your disk to
:death and the 6000 reallocated sectors happened when it
:was close to failing completely.
:
:Note that there are 3 stages to heat death (the temperature ranges
:are my personal estimates and also depend on the drive):
:
: 1. Starts to produce errors [60-70C]: you were there
: 2. Fails, but works again after cooldown [65-75C]
: 3. Fails permanently or suffers permanent damage [?]
:
:In all stages the disk ages very rapidly and may fail soon.
:I would also not really trust a disk anymore that has reached
:stage 2.
And, unless that drive has been operating in an unusually hot
environment (sitting on top of a hot air register, maybe?) I'd scream
bloody Hell to WD about the unconscionably bad thermal design of that
enclosure.

I am a bit surprised by this. I have several WD Elements
1TB and 1.5TB drives, both in the older aluminum and the newer
plastic case, and they do not have anything like this
problem. What I see is something like 15C over ambient
temperature.

If the MyBook drives get that much hotter, then WD seems
to have messed up badly. Not that this would surprise me.
There are far too many companies hiring young, inexperienced,
cheap engineers for design work.

The last instance I had the misfortune to run into was a new ASUS
mainboard with thermal design so bad it died within a week. The
northbridge cooler was thermally a bit on the small side, but at the
same time mechanically on the large side and attached so badly that a
light touch would tear it loose from the chip and kill the chip. When I
then found thermal grease incompetently applied over the unremoved
(!) phase change pad on the replacement board, I decided not to buy
ASUS again. This looks very much like cutting costs a bit too much and
not noticing it. Not something an experienced engineer does, but a
typical beginner's mistake. The ones truly responsible are of course
those who hired the inexperienced engineers and did not give them
experienced support and supervision, i.e. this is very likely a
management mess-up.

Arno
 
Robert Nichols

:
:The last instance I had the misfortune to run into was a new ASUS
:mainboard with thermal design so bad it died within a week. The
:northbridge cooler was thermally a bit on the small side, but at the
:same time mechanically on the large side and attached so badly that a
:light touch would tear it loose from the chip and kill the chip. When I
:then found thermal grease incompetently applied over the unremoved
:(!) phase change pad on the replacement board, I decided not to buy
:ASUS again. This looks very much like cutting costs a bit too much and
:not noticing it. Not something an experienced engineer does, but a
:typical beginner's mistake. The ones truly responsible are of course
:those who hired the inexperienced engineers and did not give them
:experienced support and supervision, i.e. this is very likely a
:management mess-up.

OTOH, my recent ASUS motherboard (M4A78T-E) has been performing
splendidly for 3 months now.

As with most things, YMMV.
 
Bob Willard

Arno said:
There are far too many companies hiring young, inexperienced,
cheap engineers for design work.

The last instance I had the misfortune to run into was a new ASUS
mainboard with thermal design so bad it died within a week. The
northbridge cooler was thermally a bit on the small side, but at the
same time mechanically on the large side and attached so badly that a
light touch would tear it loose from the chip and kill the chip. When I
then found thermal grease incompetently applied over the unremoved
(!) phase change pad on the replacement board, I decided not to buy
ASUS again. This looks very much like cutting costs a bit too much and
not noticing it. Not something an experienced engineer does, but a
typical beginner's mistake. The ones truly responsible are of course
those who hired the inexperienced engineers and did not give them
experienced support and supervision, i.e. this is very likely a
management mess-up.

Arno

I do suspect, based on limited evidence, that there has been a
reduction in product quality in recent years. Some examples of
my oldies-but-goodies:

1. My K7M and its 500 MHz Athlon CPU have been running 24x7 since
Dec'99. I did replace the PS once and I did add a HD twice, and
I think I added RAM once, but the MB+CPU are original.

Stability may be due to continued use of Win98. ;-)

2. My primary printer for this multiple-PC site is a HPLJ4L, bought
in Jul'94. It uses a couple of cartridges a year (light duty).
Other than a paper jam roughly every six months, reliability
is incredible.

Meanwhile, I'm on my 5th or 6th InkJet, from various vendors.
They eat color cartridges, clog badly, and don't handle paper
well. I push everybody here to stick to B&W for printouts.

<Black helicopter alert> Could it be that hardware vendors have
gotten jealous of the level of quality that PC software vendors
get by with, and have tried to lower their quality standards
to match M$? <Cancel alert>
 
Arno

Bob Willard said:
<Black helicopter alert> Could it be that hardware vendors have
gotten jealous of the level of quality that PC software vendors
get by with, and have tried to lower their quality standards
to match M$? <Cancel alert>

Hehehe, makes a lot of sense to me!

Arno
 
