PC Review



The case for ECC.

 
 
Alan Walpool
Guest
Posts: n/a
 
      12th Dec 2004
Hi,

I know this issue has pretty much been run into the ground, but the
last year has changed my mind on it.

Two cases.

1) Very old pentium pro system with ecc memory used for a
firewall/bridge. Immediately had a memory parity error show on the
screen and the system halted. Checked the memory with memtest and
sure enough it was bad. Replaced the memory and everything was back
to normal. It took a day to correct the problem, and data was
intact. Memory was noname and no warranty.

2) Had a 2 year old amd athlon system with non-ecc memory and the
system started locking up. One of the disks was corrupted. I
started trying to track the problem down, and continued to have
random system lockups. It got so bad the system was not booting.
Removed all cards but the video card, and still lockups. Finally
checked the memory with memtest, and sure enough the memory was
bad. System was never overclocked and did not have any heat
related problems. Well after data corruption, and 4-5 days of
pulling my hair out, I figured it out. Memory was name brand with a
lifetime warranty, so I sent for an RMA on the memory.

Long story short, I prefer case #1 over case #2. At least it is easier
to diagnose the problem with ECC memory. I thought that memory was so
good now that home users did not need ECC memory, and that is what
many regular posters in this newsgroup have said over and over.

The next system I purchase will have ECC memory. My time is well worth
the minor difference in price. Since I don't overclock it is not an
issue. Heck, that fancy overclocking memory costs way more than ECC
memory.

Whatever,

Alan
 
 
 
 
 
keith
Guest
Posts: n/a
 
      12th Dec 2004
On Sun, 12 Dec 2004 10:50:55 -0600, Alan Walpool wrote:

> Hi,
>
> I know this issue has been pretty much been run in the ground, but the
> last year has changed my mind on this issue.
>
> Two cases.
>
> 1) Very old pentium pro system with ecc memory used for a
> firewall/bridge. Immediately had a memory parity error show on the
> screen and the system halted. Checked the memory with memtest and
> sure enough it was bad. Replaced the memory and everything was back
> to normal. It took a day to correct the problem, and data was
> intact. Memory was noname and no warranty.
>
> 2) Had a 2 year old amd athlon system with non-ecc memory and the
> system started locking up. One of the disks was corrupted. I
> started trying to track the problem down, and continued to have
> random system lockups. It got so bad the system was not booting.
> Removed all cards but the video card, and still lockups. Finally
> checked the memory with memtest, and sure enough the memory was
> bad. System was never overclocked and did not have any heat
> related problems. Well after data corruption, and 4-5 days of
> pulling my hair out, I figured it out. Memory was name brand with a
> lifetime warranty, I sent for a RMA on the memory.
>
> The long story is I prefer case #1 over case #2. At least it is easier
> to diagnosis the problem with ECC memory. I thought that memory was so
> good now that home users did not need ECC memory, and that is what
> many regular posters in this newsgroup have said over and over.


I'm not sure which regulars have said that here. I have ECC memory
even on my K6-III system. Memory has never been "so good" that it never
fails. My only "issue" with ECC is that I can't test whether it's really
working (why have I never seen an error?). How do I know that any errors
are actually getting reported somewhere so I can take corrective action?

> The next system I purchase will have ECC memory. My time is well worth
> the minor difference in price. Since I don't overclock it is not an
> issue. Heck, that fancy overclocking memory costs way more than ECC
> memory.


ECC memory prices dropped down to the 11% overhead number a long time ago.
Memory for the K6-III was cheap enough in '99 that I figured, "why not?"
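
For what it's worth, the overhead figure can be checked with simple arithmetic. This is only a sketch, and it assumes the common 72-bit ECC DIMM layout (64 data bits plus 8 check bits), which the post does not state. Under that assumption the check bits are about 11% of the module, or 12.5% extra relative to a non-ECC module. The name ecc_overhead is made up for illustration.

```python
# Assumed layout (not stated in the thread): 64 data bits + 8 check bits.
DATA_BITS = 64
CHECK_BITS = 8


def ecc_overhead(data_bits=DATA_BITS, check_bits=CHECK_BITS):
    """Return (check bits as a share of all bits, extra bits relative to data)."""
    share_of_total = check_bits / (data_bits + check_bits)
    extra_over_data = check_bits / data_bits
    return share_of_total, extra_over_data


share, extra = ecc_overhead()
print(f"check bits as share of module: {share:.1%}")
print(f"extra bits versus non-ECC:     {extra:.1%}")
```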

--
Keith
 
 
Will Dormann
Guest
Posts: n/a
 
      12th Dec 2004
Alan Walpool wrote:
> 1) Very old pentium pro system with ecc memory used for a
> firewall/bridge. Immediately had a memory parity error show on the
> screen and the system halted. Checked the memory with memtest and
> sure enough it was bad. Replaced the memory and everything was back
> to normal. It took a day to correct the problem, and data was
> intact. Memory was noname and no warranty.


Did the motherboard/BIOS support ECC RAM? The thing about ECC RAM is
that it should transparently fix 1-bit memory errors.

Or do you think that the RAM has been going bad and the system has been
fixing 1-bit errors and then finally got to the point where it
encountered a 2-bit error?

Either way, I do agree that ECC is nice to have in a system. My current
system has it, and I think it was a nice investment. The only thing I
think would be nice is if my motherboard had some sort of DMI logging
mechanism for memory errors. That way I'd be able to see if the ECC
has done its job at any point during the time I've owned it.
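
To illustrate the 1-bit versus 2-bit distinction, here is a toy single-error-correct, double-error-detect (SECDED) code. It is a sketch only: it uses an extended Hamming(8,4) code on 4 data bits, whereas real ECC modules typically protect 64 data bits with 8 check bits in the memory controller. The function names encode and decode are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    overall_odd = sum(code) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Odd parity: treat as a single flipped bit. The syndrome is its
        # position (0 means the overall parity bit itself was flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but nonzero syndrome: two bits flipped, cannot repair.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1                        # simulate a single-bit memory error
two_flips = list(one_flip)
two_flips[2] ^= 1                       # simulate a second error in the same word
for label, w in (("clean", word), ("1-bit", one_flip), ("2-bit", two_flips)):
    print(label, decode(w))
```

The 1-bit case is repaired silently, which is why a counter or log is needed to know it ever happened; the 2-bit case can only be reported, which matches a halt with a memory error message.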


--
-WD
 
 
Alan Walpool
Guest
Posts: n/a
 
      12th Dec 2004
>>>>> "keith" == keith <(E-Mail Removed)> writes:

keith> I'm not sure what regulars have said such here. I have ECC
keith> memory even on my K6-III system. Memory has never been "so
keith> good" that it never fails. My only "issue" with ECC is that I
keith> can't test whether it's really working (why have I never seen
keith> an error?). How do I know that any errors are actually getting
keith> reported somewhere so I can take corrective action?

My old Pentium Pro motherboard has a memory error count in the BIOS.
At least on the BIOS I have, you can monitor ECC corrections there. If
it gets really bad it will cause a parity error and shut down the
system.

Depends on the BIOS and motherboard.

Interesting.

Alan

 
 
Alan Walpool
Guest
Posts: n/a
 
      12th Dec 2004
>>>>> "Will" == Will Dormann <(E-Mail Removed)> writes:

Will> Alan Walpool wrote:
>> 1) Very old pentium pro system with ecc memory used for a
>> firewall/bridge. Immediately had a memory parity error show on the
>> screen and the system halted. Checked the memory with memtest and
>> sure enough it was bad. Replaced the memory and everything was
>> back to normal. It took a day to correct the problem, and data was
>> intact. Memory was noname and no warranty.


Will> Did the motherboard/BIOS support ECC RAM? The thing about ECC
Will> ram is that it should transparently fix 1-bit memory errors.

Will> Or do you think that the RAM has been going bad and the system
Will> has been fixing 1-bit errors and then finally got to the point
Will> where it encountered a 2-bit error?

Will> Either way, I do agree that ECC is nice to have in a system. My
Will> current system has it, and I think it was a nice investment.
Will> The only thing I think would be nice is if my motherboard had
Will> some sort of DMI logging mechanism for memory errors. That way
Will> I'd be able to see if the ECC has done its job at any point
Will> during the time I've owned it.

The motherboard BIOS detected and reported the ECC memory fine. The
BIOS in that old Pentium Pro motherboard logs ECC errors. It was
reporting some errors at first but nothing that it could not handle. I
guess it became so bad that it gave up trying to correct the memory
error and halted the system completely with a message saying memory
error. Didn't write down the exact error message.

I guess this all depends on the BIOS, whether it reports errors or not.

I have not checked lately but I seriously doubt desktop PCs have any
logging for ECC errors. That old Pentium Pro system I have was
really a server motherboard at one time.

Interesting.

Alan


 
 
Will Dormann
Guest
Posts: n/a
 
      12th Dec 2004
Alan Walpool wrote:
> I have not checked lately but I seriously doubt desktop PC's have any
> logging for ECC errors. Really that old pentium pro system I have was
> really a server motherboard at one time.


Yes, that seems to be the case. The only machines I've used that log
ECC errors are SGI workstations and Dell servers. Nothing desktop-wise,
which is a shame.

There is a Linux kernel module that supposedly monitors and reports ECC
errors, but I haven't been able to get it to compile on my Gentoo (2.6
kernel) system.

http://www.anime.net/~goemon/linux-ecc/
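
On later Linux kernels this kind of reporting is available through the EDAC subsystem, which exposes per-memory-controller counters under /sys/devices/system/edac/mc/. A minimal sketch of reading them follows, assuming an EDAC driver for the chipset is loaded; the function name read_edac_counts is hypothetical and not part of the module linked above.

```python
from pathlib import Path

# Standard EDAC sysfs location; present only when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: (corrected, uncorrected)} for each mc* directory."""
    counts = {}
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue                    # skip controllers we cannot read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("no EDAC memory controllers found")
    for name, (ce, ue) in found.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A nonzero corrected count with a zero uncorrected count would be the "ECC has done its job" evidence asked about earlier in the thread.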

Would be nice if Windows had some sort of equivalent functionality...


--
-WD
 
 
David Wang
Guest
Posts: n/a
 
      12th Dec 2004
Will Dormann <(E-Mail Removed)> wrote:
> Alan Walpool wrote:
> > I have not checked lately but I seriously doubt desktop PC's have any
> > logging for ECC errors. Really that old pentium pro system I have was
> > really a server motherboard at one time.


> Yes, that seems to be the case. The only machines I've used that log
> ECC errors are SGI workstations and Dell servers. Nothing desktop-wise,
> which is a shame.


The hardware hooks are there for the 925X series chipset.
I haven't looked very hard, but IIRC, they're pretty much
in all the older "high end" desktop chipsets as well, something
like the 875P.

Whether software uses those hooks and logs (correctable) 1-bit ECC
errors is another story.


--
davewang202(at)yahoo(dot)com
 
 
nobody@nowhere.net
Guest
Posts: n/a
 
      13th Dec 2004
On Sun, 12 Dec 2004 10:50:55 -0600, Alan Walpool
<(E-Mail Removed)> wrote:

>Hi,
>
>I know this issue has been pretty much been run in the ground, but the
>last year has changed my mind on this issue.
>
>Two cases.
>
>1) Very old pentium pro system with ecc memory used for a
> firewall/bridge. Immediately had a memory parity error show on the
> screen and the system halted. Checked the memory with memtest and
> sure enough it was bad. Replaced the memory and everything was back
> to normal. It took a day to correct the problem, and data was
> intact. Memory was noname and no warranty.
>
>2) Had a 2 year old amd athlon system with non-ecc memory and the
> system started locking up. One of the disks was corrupted. I
> started trying to track the problem down, and continued to have
> random system lockups. It got so bad the system was not booting.
> Removed all cards but the video card, and still lockups. Finally
> checked the memory with memtest, and sure enough the memory was
> bad. System was never overclocked and did not have any heat
> related problems. Well after data corruption, and 4-5 days of
> pulling my hair out, I figured it out. Memory was name brand with a
> lifetime warranty, I sent for a RMA on the memory.
>
>The long story is I prefer case #1 over case #2. At least it is easier
>to diagnosis the problem with ECC memory. I thought that memory was so
>good now that home users did not need ECC memory, and that is what
>many regular posters in this newsgroup have said over and over.
>
>The next system I purchase will have ECC memory. My time is well worth
>the minor difference in price. Since I don't overclock it is not an
>issue. Heck, that fancy overclocking memory costs way more than ECC
>memory.
>
>Whatever,
>
>Alan

Man, you just made a case for socket 940 - the board is cheaper than
939, the CPU (Opteron) goes for roughly the same price as the equivalent
939 (A64 FX), the only complaint usually is that the registered ECC RAM it
uses is somewhat slower and more expensive. But you want ECC, so 940
is the way to go, unless you are willing to pay an arm and a leg for
a slower Xeon.

 
 
Grumble
Guest
Posts: n/a
 
      13th Dec 2004
(E-Mail Removed) wrote:

> On Sun, 12 Dec 2004 10:50:55 -0600, Alan Walpool wrote:
>
>>I know this issue has been pretty much been run in the ground, but the
>>last year has changed my mind on this issue.
>>
>>Two cases.
>>
>>1) Very old pentium pro system with ecc memory used for a
>> firewall/bridge. Immediately had a memory parity error show on the
>> screen and the system halted. Checked the memory with memtest and
>> sure enough it was bad. Replaced the memory and everything was back
>> to normal. It took a day to correct the problem, and data was
>> intact. Memory was noname and no warranty.
>>
>>2) Had a 2 year old amd athlon system with non-ecc memory and the
>> system started locking up. One of the disks was corrupted. I
>> started trying to track the problem down, and continued to have
>> random system lockups. It got so bad the system was not booting.
>> Removed all cards but the video card, and still lockups. Finally
>> checked the memory with memtest, and sure enough the memory was
>> bad. System was never overclocked and did not have any heat
>> related problems. Well after data corruption, and 4-5 days of
>> pulling my hair out, I figured it out. Memory was name brand with a
>> lifetime warranty, I sent for a RMA on the memory.
>>
>>The long story is I prefer case #1 over case #2. At least it is easier
>>to diagnosis the problem with ECC memory. I thought that memory was so
>>good now that home users did not need ECC memory, and that is what
>>many regular posters in this newsgroup have said over and over.
>>
>>The next system I purchase will have ECC memory. My time is well worth
>>the minor difference in price. Since I don't overclock it is not an
>>issue. Heck, that fancy overclocking memory costs way more than ECC
>>memory.

>
> Man, you just made a case for socket 940 - the board is cheaper than
> 939, the CPU (Opteron) goes for roughly the same price as equivalent
> 939 (A64FX), the only complaint usually is that registered ECC RAM it
> uses is somewhat slower and more expensive. But you want ECC, so 940
> is the way to go, unless you are willing to pay an arm and a leg for
> slower Xeon.


Registered (also known as buffered) and ECC are two separate features.
You can buy unbuffered ECC DDR SDRAM DIMMs, for example from Kingston:

http://www.ec.kingston.com/ecom/conf...R400X72C3A/512

Click on search to see a list of compatible motherboards.

There are Socket 754 and Socket 939 motherboards which support ECC
memory modules. For example:

http://www.asus.com/prog/spec.asp?m=K8N-E%20Deluxe

--
Regards, Grumble
 
 
John
Guest
Posts: n/a
 
      13th Dec 2004
Alan Walpool wrote:
>
> 2) Had a 2 year old amd athlon system with non-ecc memory and the
> system started locking up. One of the disks was corrupted. I
> started trying to track the problem down, and continued to have
> random system lockups. It got so bad the system was not booting.
> Removed all cards but the video card, and still lockups. Finally
> checked the memory with memtest, and sure enough the memory was
> bad. System was never overclocked and did not have any heat
> related problems. Well after data corruption, and 4-5 days of
> pulling my hair out, I figured it out. Memory was name brand with a
> lifetime warranty, I sent for a RMA on the memory.



I feel your pain. It's wise to consider ECC.

When a system has disk corruption, crashes, or blue screens, I reach for
MEMTEST first (Disk Doctor second). You can screw up memory by fiddling
with hardware; it can fail on its own or from a power spike; it can
even glitch when a cosmic ray hits it (at least that used to be a
worry); it can be running under marginal and deteriorating conditions;
etc.

However, for non-critical use, if you buy "reasonable quality" (power
supply, motherboard, memory, cooling), operate within manufacturer's
parameters, and perform burn-in testing, you'll be OK. Keep MEMTEST
handy. In my experience memory failure hasn't been an issue for years
and years. If in doubt, ask your local hardware shop what they think of
current configurations. I respect the expertise of good local shops,
especially if they warrant what they sell.

 
 
 
 