Do SSD drives really fail a lot ?

Discussion in 'Storage Devices' started by Lynn McGuire, May 3, 2011.

  1. Lynn McGuire

    Lynn McGuire Guest

    Do SSD drives really fail a lot ?
    http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html

    "… I feel ethically and morally obligated to let you in on a
    dirty little secret I've discovered in the last two years of
    full time SSD ownership. Solid state hard drives fail. A lot.
    And not just any fail. I'm talking about catastrophic,
    oh-my-God-what-just-happened-to-all-my-data instant gigafail.
    It's not pretty. "

    Lynn
     
    Lynn McGuire, May 3, 2011
    #1

  2. "Lynn McGuire" <> wrote in message
    news:ipp73a$oo4$...

    > Do SSD drives really fail a lot ?
    >
    > http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html
    >
    > "… I feel ethically and morally obligated to let you in on a
    > dirty little secret I've discovered in the last two years of
    > full time SSD ownership. Solid state hard drives fail. A lot.
    > And not just any fail. I'm talking about catastrophic,
    > oh-my-God-what-just-happened-to-all-my-data instant gigafail.
    > It's not pretty. "


    LM omitted from the next page:
    "Solid state hard drives are so freaking amazing performance wise, and the
    experience you will have with them is so transformative, that I don't even
    care if they fail every 12 months on average! I can't imagine using a
    computer without a SSD any more; it'd be like going back to dial-up internet
    ... "


    --
    Don Phillipson
    Carlsbad Springs
    (Ottawa, Canada)
     
    Don Phillipson, May 3, 2011
    #2

  3. Lynn McGuire

    Arno Guest

    Lynn McGuire <> wrote:
    > Do SSD drives really fail a lot ?
    > http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html


    > "… I feel ethically and morally obligated to let you in on a
    > dirty little secret I've discovered in the last two years of
    > full time SSD ownership. Solid state hard drives fail. A lot.
    > And not just any fail. I'm talking about catastrophic,
    > oh-my-God-what-just-happened-to-all-my-data instant gigafail.
    > It's not pretty. "
    > Lynn


    It depends on your usage pattern and the SSD. Failure rate is
    a designed feature with SSDs, i.e. the manufacturers know pretty
    well how much writing an SSD can take. By choosing the wear-leveling
    scheme and the amount of spare capacity, they can design for a
    specific write load that kills a drive. With a new technology this
    process is still shaky, though, and whole drive series can have
    worse reliability.

    The typical reliability design goal is a 5% failure rate
    per year for an average usage pattern. Consumers are willing
    to tolerate that. That is a real failure rate, but it is
    not "all the time". There are people that think because SSDs
    are not susceptible to mechanical damage, they could do without
    backup. Those people will lose their data, no matter what
    storage medium it is on. The 5% target will stay until some day
    no money can be saved by aiming for it, and then reliability
    will slowly go up.
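
    To put the 5% per year figure in perspective, here is a minimal sketch of how such a rate compounds over several years. It assumes a constant annual failure rate and independent years, which is a simplification; the function name is made up for illustration.

```python
def cumulative_failure_probability(annual_failure_rate: float, years: int) -> float:
    """Probability that a drive has failed by the end of `years`.

    Assumes the same, independent failure probability every year
    (a simplification, not a property of any particular drive).
    """
    survival = (1.0 - annual_failure_rate) ** years
    return 1.0 - survival


if __name__ == "__main__":
    for n in (1, 3, 5):
        p = cumulative_failure_probability(0.05, n)
        print(f"{n} year(s): {p:.1%} chance of failure")
```

    Under these assumptions, a 5% annual rate works out to a bit under one in four drives failing within five years.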

    That said, I think the coding horror person (who has some
    pretty nice things about coding in his blog) has a sample of
    mostly early models. These, like any new technology, have
    increased failure rates, as the manufacturers try to aim
    for that 5%/year but make mistakes in the process. It could
    also just be a statistical anomaly.

    There is one additional thing: SSDs are susceptible to
    heat, just like any other electronics and to bad power.
    It is possible that the guy with the 8 of 8 dead drives
    just killed them by overheating or by voltage spikes
    from a cheap/bad PSU. For heat, the rule of thumb is half
    the lifetime for every 10C increase for semiconductors, and
    this works pretty well. I have seen it several times now, once
    on a 22 unit network card sample. As SSDs contain power circuitry,
    some parts of them run much hotter (step-up regulators for
    converting 5V to the write-voltage needed), and lifetime
    of 5 years is typically calculated at 40C environmental
    temperature. Run them at 60C and you get 1.25 years average
    lifetime. Other example: Memory and logic chips have something
    like 30 years at 25C (figure from a very old Intel databook).
    Run them at 65C and you get around 2 years lifetime.
    That means you get the first failures (depending on
    sample size) after 1-1.5 years and after 3 years most are
    dead. This incidentally was my initial measurement and
    prediction for the 22 network cards and what happened
    then. Note that high-performance CPUs are different, as
    they are more designed as power semiconductors. But chipsets
    are not. I have seen several fail from inadequate cooling
    in 1-3 years.
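
    The "half the lifetime every 10C" rule of thumb above can be written down as a small calculation. This is only a sketch of that rule as stated in this post, not a physical model; the function and parameter names are made up for illustration.

```python
def derated_lifetime_years(rated_years, rated_temp_c, actual_temp_c,
                           halving_step_c=10.0):
    """Apply the rule of thumb: lifetime halves for every
    `halving_step_c` degrees above the rated temperature."""
    halvings = (actual_temp_c - rated_temp_c) / halving_step_c
    return rated_years / (2.0 ** halvings)


if __name__ == "__main__":
    # SSD example from the post: 5 years at 40C, run at 60C.
    print(derated_lifetime_years(5, 40, 60))   # prints 1.25
    # Memory/logic chip example: 30 years at 25C, run at 65C.
    print(derated_lifetime_years(30, 25, 65))  # prints 1.875
```

    Both results match the figures quoted above (1.25 years, and "around 2 years").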

    There is one other effect at work here: A lot of people
    expected SSDs to be much more reliable than HDDs.
    They are not in general, see above. This can lead
    to disappointments causing overstatement of the problem.

    Altogether, I don't believe we are seeing more than
    early-adopter problems, and they are always the same.
    Also, there are certainly cheap SSDs and better
    SSDs, just like always, and it is possible to treat SSDs
    well or badly.

    Arno
    --
    Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
    GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
    ----
    Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
     
    Arno, May 4, 2011
    #3
  4. Lynn McGuire

    Franc Zabkar Guest

    On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire <> put
    finger to keyboard and composed:

    >Do SSD drives really fail a lot ?
    > http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html


    The most common reason for failure (90%) in flash drives appears to be
    translator corruption (damaged lookup tables), especially if the power
    fails while the translator is being updated. Afterwards the drive
    powers up in safe mode with a very small capacity.

    What are the Flash drives' typical failures [Public Forum]:
    http://www.salvationdata.com/forum/topic1873.html

    I suspect that SSDs may be similarly affected. Perhaps that's why some
    newer models have large super capacitors for power backup.

    - Franc Zabkar
    --
    Please remove one 'i' from my address when replying by email.
     
    Franc Zabkar, May 17, 2011
    #4
  5. Lynn McGuire

    JW Guest

    On Tue, 17 May 2011 10:43:49 +1000 Franc Zabkar
    <> wrote in Message id:
    <>:

    >On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire <> put
    >finger to keyboard and composed:
    >
    >>Do SSD drives really fail a lot ?
    >> http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html

    >
    >The most common reason for failure (90%) in flash drives appears to be
    >translator corruption (damaged lookup tables), especially if the power
    >fails while the translator is being updated. Afterwards the drive
    >powers up in safe mode with a very small capacity.
    >
    >What are the Flash drives' typical failures [Public Forum]:
    >http://www.salvationdata.com/forum/topic1873.html
    >
    >I suspect that SSDs may be similarly affected. Perhaps that's why some
    >newer models have large super capacitors for power backup.


    Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
    controller that can cause the device to revert to 8MB during a power
    failure. AFAIK they have not yet publicly announced it, and won't have a
    firmware fix ready for release until the end of July.

    We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
    rep. I was able to kill it in two or three hours by power cycling it.
    Apparently (according to the Intel rep) when the power failure is
    happening, the SSD device tries to reconnect with the SATA port instead of
    initiating a proper shutdown. Something to do with interrupt priority
    being higher for reconnection rather than a proper shutdown.

    I was able to kill their 80GB device as well. We've sent both drives back
    to Intel and they're going to give us their pre-release firmware for
    testing.
     
    JW, May 17, 2011
    #5
  6. Lynn McGuire

    Arno Guest

    JW <> wrote:
    > On Tue, 17 May 2011 10:43:49 +1000 Franc Zabkar
    > <> wrote in Message id:
    > <>:


    >>On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire <> put
    >>finger to keyboard and composed:
    >>
    >>>Do SSD drives really fail a lot ?
    >>> http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html

    >>
    >>The most common reason for failure (90%) in flash drives appears to be
    >>translator corruption (damaged lookup tables), especially if the power
    >>fails while the translator is being updated. Afterwards the drive
    >>powers up in safe mode with a very small capacity.
    >>
    >>What are the Flash drives' typical failures [Public Forum]:
    >>http://www.salvationdata.com/forum/topic1873.html
    >>
    >>I suspect that SSDs may be similarly affected. Perhaps that's why some
    >>newer models have large super capacitors for power backup.


    > Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
    > controller that can cause the device to revert to 8MB during a power
    > failure. AFAIK they have not yet publicly announced it, and won't have a
    > firmware fix ready for release until the end of July.


    > We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
    > rep. I was able to kill it in two or three hours by power cycling it.
    > Apparently (according to the Intel rep) when the power failure is
    > happening, the SSD device tries to reconnect with the SATA port instead of
    > initiating a proper shutdown. Something to do with interrupt priority
    > being higher for reconnection rather than a proper shutdown.


    > I was able to kill their 80GB device as well. We've sent both drives back
    > to Intel and they're going to give us their pre-release firmware for
    > testing.


    Interesting. Goes to show that firmware development is apparently
    not done any better than other software development. I am tempted
    to run my next SSD through similar tests before using it.
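
    JW's actual test platform is not described in this thread, so the following is only a hypothetical sketch of the software half of such a test: write files with known checksums, force them to the device, then (after power cycling the drive by whatever means you have) check that everything is still readable and intact. All names are made up for illustration.

```python
import hashlib
import os
import tempfile


def write_test_files(directory, count=8, size=1 << 20):
    """Write `count` files of random data; return {filename: sha256 hex}."""
    manifest = {}
    for i in range(count):
        data = os.urandom(size)
        name = f"block_{i:04d}.bin"
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        manifest[name] = hashlib.sha256(data).hexdigest()
    return manifest


def verify_test_files(directory, manifest):
    """Return names of files that are missing, unreadable, or changed."""
    bad = []
    for name, expected in manifest.items():
        try:
            with open(os.path.join(directory, name), "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
        except OSError:
            bad.append(name)
            continue
        if actual != expected:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Demo on a temporary directory; a real test would point at a
    # filesystem on the SSD under test and power cycle between the steps.
    with tempfile.TemporaryDirectory() as d:
        manifest = write_test_files(d, count=4, size=4096)
        print("damaged files:", verify_test_files(d, manifest))
```

    In a real run the manifest would have to be kept somewhere other than the drive under test, and the power cycling itself needs hardware that this sketch does not cover.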

    Arno

    --
    Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
    GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
    ----
    Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
     
    Arno, May 17, 2011
    #6
  7. Lynn McGuire

    JW Guest

    On Tue, 17 May 2011 06:32:45 -0400 JW <> wrote in Message id:
    <>:

    >On Tue, 17 May 2011 10:43:49 +1000 Franc Zabkar
    ><> wrote in Message id:
    ><>:
    >
    >>On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire <> put
    >>finger to keyboard and composed:
    >>
    >>>Do SSD drives really fail a lot ?
    >>> http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html

    >>
    >>The most common reason for failure (90%) in flash drives appears to be
    >>translator corruption (damaged lookup tables), especially if the power
    >>fails while the translator is being updated. Afterwards the drive
    >>powers up in safe mode with a very small capacity.
    >>
    >>What are the Flash drives' typical failures [Public Forum]:
    >>http://www.salvationdata.com/forum/topic1873.html
    >>
    >>I suspect that SSDs may be similarly affected. Perhaps that's why some
    >>newer models have large super capacitors for power backup.

    >
    >Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
    >controller that can cause the device to revert to 8MB during a power
    >failure. AFAIK they have not yet publicly announced it, and won't have a
    >firmware fix ready for release until the end of July.
    >
    >We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
    >rep. I was able to kill it in two or three hours by power cycling it.
    >Apparently (according to the Intel rep) when the power failure is
    >happening, the SSD device tries to reconnect with the SATA port instead of
    >initiating a proper shutdown. Something to do with interrupt priority
    >being higher for reconnection rather than a proper shutdown.
    >
    >I was able to kill their 80GB device as well. We've sent both drives back
    >to Intel and they're going to give us their pre-release firmware for
    >testing.


    The Pre-release firmware also had the problem. I ended up supplying Intel
    SSD engineering with my test platform and they reproduced the problem and
    have a fix pending. See:
    http://communities.intel.com/thread/24121?tstart=0

    The firmware is not yet released however.

    Looks like this Usenet thread caused quite a bit of commotion on their
    forum:
    http://communities.intel.com/thread/22227?tstart=0

    :)
     
    JW, Aug 16, 2011
    #7
  8. Lynn McGuire

    Arno Guest

    JW <> wrote:
    > On Tue, 17 May 2011 06:32:45 -0400 JW <> wrote in Message id:
    > <>:


    >>On Tue, 17 May 2011 10:43:49 +1000 Franc Zabkar
    >><> wrote in Message id:
    >><>:
    >>
    >>>On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire <> put
    >>>finger to keyboard and composed:
    >>>
    >>>>Do SSD drives really fail a lot ?
    >>>> http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html
    >>>
    >>>The most common reason for failure (90%) in flash drives appears to be
    >>>translator corruption (damaged lookup tables), especially if the power
    >>>fails while the translator is being updated. Afterwards the drive
    >>>powers up in safe mode with a very small capacity.
    >>>
    >>>What are the Flash drives' typical failures [Public Forum]:
    >>>http://www.salvationdata.com/forum/topic1873.html
    >>>
    >>>I suspect that SSDs may be similarly affected. Perhaps that's why some
    >>>newer models have large super capacitors for power backup.

    >>
    >>Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
    >>controller that can cause the device to revert to 8MB during a power
    >>failure. AFAIK they have not yet publicly announced it, and won't have a
    >>firmware fix ready for release until the end of July.
    >>
    >>We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
    >>rep. I was able to kill it in two or three hours by power cycling it.
    >>Apparently (according to the Intel rep) when the power failure is
    >>happening, the SSD device tries to reconnect with the SATA port instead of
    >>initiating a proper shutdown. Something to do with interrupt priority
    >>being higher for reconnection rather than a proper shutdown.
    >>
    >>I was able to kill their 80GB device as well. We've sent both drives back
    >>to Intel and they're going to give us their pre-release firmware for
    >>testing.


    > The Pre-release firmware also had the problem. I ended up supplying Intel
    > SSD engineering with my test platform and they reproduced the problem and
    > have a fix pending. See:
    > http://communities.intel.com/thread/24121?tstart=0


    This is rather pathetic on their side (not so at all on your side,
    obviously).

    > The firmware is not yet released however.


    > Looks like this Usenet thread caused quite a bit of commotion on their
    > forum:
    > http://communities.intel.com/thread/22227?tstart=0


    > :)


    Understandable. The conclusion can only be to stay away from
    Intel SSDs for the next few years, until they have
    demonstrated they have their Q/A under control and have started
    to take the data safety of their customers seriously.

    It also underlines something I have been saying for a while,
    namely that SSDs should be regarded as less reliable than HDDs at
    this time, because of engineering screw-ups like this one.

    My SSDs are either in a RAID with non-SSDs (with the non-SSDs
    flagged "write mostly", which gives SSD read speeds under Linux
    software RAID) or do not have critical data on them.

    Arno
    --
    Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
    GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
    ----
    Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
     
    Arno, Aug 16, 2011
    #8
  9. Lynn McGuire

    whydoallmyssdfail

    >> I don't even care if they fail every 12 months
    if you get a ssd to last 12 months that is a miracle !

    this is the lifespan of all the ssd's ive installed :
    ocz solid - 47hrs
    ocz vertex - 3 months
    ocz agility - 11 months

    compare that to mechanical drives ive installed, about 40 over the last 15 years, and only one developed a corrupt sector that warranted backing all my precious data up then reformat to fix. when ssd's fail, they fail bigtime with no warning and you are left with a brick. files - gone, emails - gone, windows - gone - all in a flash.

    however its my own fault. the limitations of flash memory are well known. you can rewrite flash memory cells only about 3k ( cheap stuff 40p/gb ) - 100k ( expensive stuff $10/gb ) times before they freeze up and can never be written again. of course my ssd are all going to fail - thats the nature of flash memory which is what ssd is !
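
    A rough way to turn a rewrite-cycle limit into an expected lifetime is sketched below. It only models cell wear under ideal wear leveling, not the controller or firmware failures discussed earlier in the thread, and every number in the example (capacity, daily writes, write amplification) is a made-up assumption rather than data from any of the drives listed above.

```python
def estimated_endurance_years(capacity_gb, pe_cycles,
                              host_writes_gb_per_day,
                              write_amplification=1.0):
    """Years until every cell has used up its rewrite cycles,
    assuming perfectly even wear leveling."""
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / host_writes_gb_per_day / 365.0


if __name__ == "__main__":
    # Hypothetical drive: 60 GB, 3k cycles, 20 GB written per day,
    # write amplification of 3.
    years = estimated_endurance_years(60, 3000, 20, write_amplification=3.0)
    print(f"estimated wear-out after about {years:.1f} years")
```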




     
    Last edited: Feb 26, 2012
    whydoallmyssdfail, Feb 26, 2012
    #9