Raid5-Problem - two hard disks out of service - Server still running

beruehm

Sorry for posting in German. After I posted the last message I saw that
this newsgroup is English-only.

Hi,

I am administering a Windows 2000 domain server with five hard disks,
running as RAID 5. Now I have lost two hard disks, one after another
(please do not ask me why they gave up service). The server still
responds to ping, but that's all.

Running services are: ssh server, vnc server
I cannot access the server via vnc.
I cannot mount any shares.
I cannot log on to the domain from any client.
Any ideas how to access the data?

Thanks very much in advance for any hint.

Bernd
 
Jim Howes

beruehm said:
I am administering a Windows 2000 domain server with five hard disks,
running as RAID 5. Now I have lost two hard disks, one after another
(please do not ask me why they gave up service). The server still
responds to ping, but that's all.

Unfortunately, a RAID 5 array can tolerate the failure of at most one disk.
With two drives out, your data is pretty much inaccessible (because it is
striped across all of the disks in the set). If you can somehow persuade even
one of the failed disks to come back up, you should be able to rebuild the
other disk onto a replacement. Once that rebuild has fully completed, you can
(and should) replace and rebuild the not-quite-dead-yet disk, and run some
meaningful tests on your remaining array disks.

The problem is that RAID 4 and RAID 5 can reconstruct the data from only a
single disk failure, using parity. The difference between the two is that
RAID 4 keeps the parity on a dedicated disk, while RAID 5 spreads the parity
blocks across all disks.
http://www.lascon.co.uk/d008005.htm gives a very good indication of what is
going on.
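
To make the RAID 4 versus RAID 5 difference concrete, here is a minimal Python sketch that prints where the parity block (P) sits in each stripe. The function names and the simple "move parity one disk to the left per stripe" rule are assumptions made for illustration; real controllers use various rotation orders.

```python
def parity_disk(stripe, n_disks, level):
    """Index of the disk holding parity for a given stripe (illustrative rule)."""
    if level == 4:
        # RAID 4: parity always lives on the same dedicated disk.
        return n_disks - 1
    # RAID 5: parity rotates; this particular rotation order is an assumption.
    return (n_disks - 1 - stripe) % n_disks


def layout(n_disks, n_stripes, level):
    """One string per stripe, 'P' for the parity block and 'D' for data."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_disk(stripe, n_disks, level)
        rows.append(" ".join("P" if d == p else "D" for d in range(n_disks)))
    return rows


if __name__ == "__main__":
    for level in (4, 5):
        print(f"RAID {level}:")
        for row in layout(5, 5, level):
            print("  " + row)
```

In both layouts every stripe has exactly one parity block, which is why either level survives one failed disk but not two.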

In summary, if the bits from a particular position across an array of disks are:

1 0 1 1 1

and the third disk fails

1 0 ? 1 1

the controller knows that there should be even parity across the array, so ?=1.

If two disks fail

* 0 ? 1 1

All we know is that ? and * are the same (to maintain even parity the row has
to be 1 0 1 1 1 or 0 0 0 1 1). If one of the known 0's or 1's were different
from the above, then ? and * would be opposites (but there would still be two
valid combinations).
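
The parity argument above can be checked with a few lines of Python. This is only a sketch of the reasoning, not of how a controller works internally: it brute-forces every way of filling in the missing bits and keeps those with even parity. The helper names are made up for the example, and None stands for a bit on a failed disk.

```python
from functools import reduce
from itertools import product
from operator import xor


def parity(bits):
    """XOR of all bits; 0 means the row has even parity."""
    return reduce(xor, bits, 0)


def candidates(stripe):
    """Every way of filling the None entries that keeps even parity."""
    missing = [i for i, bit in enumerate(stripe) if bit is None]
    results = []
    for guess in product((0, 1), repeat=len(missing)):
        filled = list(stripe)
        for i, bit in zip(missing, guess):
            filled[i] = bit
        if parity(filled) == 0:
            results.append(filled)
    return results


one_lost = candidates([1, 0, None, 1, 1])     # third disk failed
two_lost = candidates([None, 0, None, 1, 1])  # first and third disks failed

print(one_lost)  # [[1, 0, 1, 1, 1]]
print(two_lost)  # [[0, 0, 0, 1, 1], [1, 0, 1, 1, 1]]
```

With one unknown there is exactly one consistent row, so the bit can be rebuilt. With two unknowns there are two equally valid rows, and nothing in the surviving data says which one was on the disks.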

Unless you can reduce the number of unknowns to one, by somehow encouraging one
of your failed disks to run just long enough for the RAID controller to
rebuild the array onto a new disk in place of the other failed drive, you have
no option but to replace both drives, re-create the array from scratch and
restore the data from a backup.

If the data is that critical, you may want to consider mirroring it across
multiple RAID 5 arrays. But this becomes extremely expensive and impractical
compared to a good backup system.

Jim
 
