Terrible ServeRAID 4Lx performance


Michael Brown

I've got a ServeRAID 4Lx installed in a 64/66 slot on an MSI K7D
motherboard (AMD MPX chipset). Connected to the card are 5 first-gen Seagate
X15's. However, there is something seriously wrong somewhere.

When using only a single drive, the read rate (from HDTune, HDTach, or just
copying files around) is flat at 25MB/sec across the whole drive, apart from
a "heart monitor" pattern of regular spikes that only reach ~26MB/sec.
Access times are fine, around or slightly below 7ms depending on which
utility you trust. I forgot to note down the single-drive write speeds, but
I think they were about the same.

When adding in the remaining 4 drives and constructing a RAID5 array (8KB
stripe size), the read performance drops to 18MB/sec, again keeping the
heartbeat pattern but this time with 3 small spikes and then a larger one.
Write performance tanks to around 4MB/sec, measured crudely by copying a 1GB
file to the array. ATTO returns nutty numbers in all cases, even when using
an ancient 2GB drive, though it always tops out at ~88MB/sec for reads.
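
For reference, the "crude" measurement is nothing fancier than timing a big
sequential write and dividing; something along these lines (a quick Python
sketch, with the path and sizes just placeholders, and an fsync so the OS
write cache doesn't flatter the number):

    import os, time

    PATH = r"E:\testfile.bin"   # placeholder: any file on the array under test
    SIZE_MB = 1024              # ~1GB, same as the file-copy test
    BLOCK = 1024 * 1024         # write in 1MB chunks

    buf = os.urandom(BLOCK)
    start = time.time()
    with open(PATH, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())    # force it out to the array before stopping the clock
    elapsed = time.time() - start
    print("write: %.1f MB/sec" % (SIZE_MB / elapsed))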

According to ServeRAID Manager, the channel is running at U160, and I
haven't had any problems with losing drives so I'm guessing that side of
things is OK. The drives are in write-back mode, though the logical drives
themselves are not.

However, there is one thing that is not right (besides the performance). The
card BIOS and firmware are at version 4.80.26 and the driver version is
7.10.18. Updating the firmware using the IBM floppy disks does not work: it
goes through the motions of loading stuff off the disks, then errors out as
soon as it goes to write, with the error code "EC: 04h-46h". Inspection of
the IBM manuals was not enlightening on this ...

I'd like to get this going at a reasonable speed, even though it's more for
"fun" than anything too serious. I'm downloading the IBM ServeRAID CD in the
hopes that it can either provide some additional tweaks or update the
firmware properly (or both), but I'd like to know if I'm going up against an
unsolvable problem with respect to the card or the motherboard.
 

Curious George

Don't expect great RAID-5 performance on this card. I think that's an
i960 with 32MB ram.

When you burn the support CD, update the drivers so they match the CD.
Restart the computer and allow it to boot from the CD (IIRC you need
at least 128MB of RAM and a not-too-complex setup to boot). The CD will
load Linux and check & update the BIOS version. You need matching BIOS &
drivers for it to work correctly.

Don't blame me if you hose the card trying to update it. Try older
drivers first if you like.
 

Michael Brown

Curious said:
Don't expect great RAID-5 performance on this card. I think that's an
i960 with 32MB ram.

Correct. However, I would expect that with a single drive it would not
bottleneck at 25MB/sec when reading ...
When you burn the support CD, update the drivers so they match the CD.
Restart the computer and allow it to boot from the CD (IIRC you need
at least 128MB of RAM and a not-too-complex setup to boot). The CD will
load Linux and check & update the BIOS version. You need matching BIOS &
drivers for it to work correctly.

Ahhh hah hah, I just discovered a little thing on the IBM site. Apparently
you can't go straight from 4.85 or earlier to 7.10B as they updated a few
things to allow larger firmware. You have to first update to 7.00 and then
update again to 7.10B. Anyhow, that all done, nothing has changed. The card
still shows exactly the same (terrible) performance as before. Any other
suggestions (including cheap "known good" adapters that will give reasonable
performance from this set of disks)?

[...]

I've also messed around with the PCI latency timer settings but there was no
difference between 32 (the default), 64, and 128.
 

Curious George

Michael Brown said:
Correct. However, I would expect that with a single drive it would not
bottleneck at 25MB/sec when reading ...
Right


Ahhh hah hah, I just discovered a little thing on the IBM site. Apparently
you can't go straight from 4.85 or earlier to 7.10B as they updated a few
things to allow larger firmware. You have to first update to 7.00 and then
update again to 7.10B.

Sorry 'bout that. I never moved from 4.x to 7.x, and my exposure to 4.x was
such a long time ago...
Anyhow, that all done, nothing has changed. The card
still shows exactly the same (terrible) performance as before. Any other
suggestions (including cheap "known good" adapters that will give reasonable
performance from this set of disks)?

I've used retail 15K Cheetahs & 7.2K and 10K Quantums/Maxtors on the 4L in
non-IBM boards. Very fast in RAID 0, 1, 1E, 1+0, & JBOD. No probs.

Compatibility with mobos is very spotty, though. I couldn't get it to
POST at all in an Athlon MP board (& several others). Some Intel &
Supermicro boards seem to be better bets (but no assurance).

Right now I'd take a closer look at the setup, including cabling, etc.,
as well as your testing methodology. You might also want to try it on
another board (something simple, even an i845, etc.).

Moving very small files @ 25MB/sec may after all be respectable. The
benchmark results may not be trustworthy. Outside of SW limitations,
are you testing a RAID volume that is the system partition? I can
think of one ServeRAID setup (with Atlas V's, I think) which seemed to
tank at 25MB/sec on C: with ATTO but did much better on an empty
volume & array (except that RAID5 write performance was atrocious).
Spindle sync (which you may not be able to do) helped a lot for single
user/few user work.

Not solutions really, but worth considering.
[...]

I've also messed around with the PCI latency timer settings but there was no
difference between 32 (the default), 64, and 128.

Also try the largest stripe-unit size, adaptive read-ahead cache, and
matching write-through cache on each drive. For a single drive or
similar it should really perform like a vanilla U160 card.
 

Michael Brown

Curious said:
I've used retail 15K Cheetahs & 7.2K and 10K Quantums/Maxtors on the 4L in
non-IBM boards. Very fast in RAID 0, 1, 1E, 1+0, & JBOD. No probs.

Compatibility with mobos is very spotty, though. I couldn't get it to
POST at all in an Athlon MP board (& several others). Some Intel &
Supermicro boards seem to be better bets (but no assurance).

I found an interesting article about the IWill MPX2 and SCSI RAID cards
(after I'd bought the card, of course :| )
http://www.burningissues.net/hard/iwill/MPX2p4.htm
I get similar ATTO results, and I found a thread (but can't find it again)
on the 2CPU forums where another person had similar problems with a 4Mx on
another MPX board. It looks like the ServeRAID 4's just don't like the
chipset unfortunately.
Right now I'd take a closer look at the setup, including cabling, etc.,
as well as your testing methodology. You might also want to try it on
another board (something simple, even an i845, etc.).

I'm going to be trying it on another board (Soltek 75DRV5, Via KT333 based)
probably this weekend. I have an Intel board but taking it out of service
for a day to mess around with would not make me too popular ;)

I've done a few more low-level tests with HD Tune. I tested with two drives
in RAID0, RAID1, and software (Windows) RAID0 (one physical drive per
logical drive), and four drives in RAID0, RAID10, and software RAID0. Also,
I tested multiple copies of HD Tune, one per drive, from 1 to 5 drives (i.e.
one copy running on logical drive 1, one running on logical drive 2, etc.).
The results were essentially that there were two limits: 25MB/sec per
logical drive and 40MB/sec overall. For example, the multiple-HDTune tests
gave the following results:
1 drive: 25MB/sec
2 drives: 19MB/sec per drive
3 drives: 13MB/sec per drive
4 drives: 10MB/sec per drive
5 drives: 8MB/sec per drive
In all cases the transfer rate line was flat with the heartbeat pattern, and
all tests except software RAID0 were done on unformatted/unpartitioned drives.
Software RAID0 was tested by a self-written program that wrote a huge random
file to the drive then read blocks of it back. Not as good as HD Tune but it
agreed with the results of multiple drives in JBOD mode.
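
In case it matters, the self-written test was essentially along these lines
(a rough Python equivalent, not the actual program; the path, file size, and
block count are arbitrary):

    import os, random, time

    PATH = r"E:\bigfile.bin"   # placeholder path on the volume under test
    FILE_MB = 2048             # big enough that the OS cache can't hold it all
    BLOCK = 1024 * 1024        # 1MB blocks
    READS = 500                # number of blocks to read back

    # write a huge file of (repeated) random data
    junk = os.urandom(BLOCK)
    with open(PATH, "wb") as f:
        for _ in range(FILE_MB):
            f.write(junk)

    # read blocks back from random offsets and time it
    start = time.time()
    with open(PATH, "rb") as f:
        for _ in range(READS):
            f.seek(random.randrange(FILE_MB) * BLOCK)
            f.read(BLOCK)
    elapsed = time.time() - start
    print("read-back: %.1f MB/sec" % (READS / elapsed))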

Additionally, I tried setting up a RAID1+hotspare and taking one of the
drives offline by marking it as defunct. The rebuild took 15m48s, which
gives a rebuild rate of 18.5MB/sec (give or take) and an overall cable usage
of ~37MB/sec. This should be independent of any PCI or driver problems, and
done on the card, I would assume.
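
The arithmetic behind those numbers, in case anyone wants to check it (the
amount of data actually rebuilt is my estimate for these 18GB-class X15s):

    # RAID1 rebuild took 15m48s
    rebuild_seconds = 15 * 60 + 48          # = 948 s

    # roughly 17.5GB mirrored onto the hot spare (estimate)
    rebuilt_mb = 17.5 * 1000

    rate = rebuilt_mb / rebuild_seconds     # ~18.5 MB/sec written to the spare
    cable = 2 * rate                        # simultaneous read from the source drive
                                            # plus write to the spare: ~37 MB/sec on the cable
    print("rebuild %.1f MB/sec, cable ~%.0f MB/sec" % (rate, cable))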

So, it looks like there's at least two problems. The first to look at is why
it seems to be bottlenecked at 40MB/sec overall. This looks a lot like the
bus is being forced into SE mode. Since I'm using 68->80 adapters, this was
the first suspect even though they're supposedly U320 compatible. I tried a
single adapter and drive, and two adapters and drives, at various positions
along the cable. I repeated this with various combinations of the 5 adapters
I've got. Same results as before, so the adapters are either completely
broken or OK. The cable is also supposedly U320 capable. It's got a built-in
LVD/SE terminator made by AMP by the looks of it.

The main problem is that I don't have any spare SCSI bits except 80-pin hard
drives. Also, as far as I can tell there's no way to get any information
from the ServeRAID about per-drive limits or anything. Only an overall
"cable speed" which is being reported as U160 regardless of what I plug into
the cable. So it seems like I'm basically reduced to shot-in-the-dark sort
of troubleshooting, unless there's some diagnostic tool I don't know about
(very likely, as I've never dealt with ServeRAIDs before).

[...]
Spindle sync (which you may not be able to do) helped a lot for single
user/few user work.

I'm using 68->80 adapters so I can't do spindle sync even if the drives
support it (of which there is no mention on the Seagate site).

[...]

Many thanks for the help!
 

Curious George

Michael Brown said:
I found an interesting article about the IWill MPX2 and SCSI RAID cards
(after I'd bought the card, of course :| )
http://www.burningissues.net/hard/iwill/MPX2p4.htm
I get similar ATTO results, and I found a thread (but can't find it again)
on the 2CPU forums where another person had similar problems with a 4Mx on
another MPX board. It looks like the ServeRAID 4's just don't like the
chipset unfortunately.

I was afraid of this, as they get along with the MP chipset (& many others)
even worse. It may still be usable for raid _protection_ even if timings
prevent optimal raid _performance_.
I'm going to be trying it on another board (Soltek 75DRV5, Via KT333 based)
probably this weekend. I have an Intel board but taking it out of service
for a day to mess around with would not make me too popular ;)

Not sure if it will even POST on that VIA chipset. I guess you'll
find out soon.
I've done a few more low-level tests with HD Tune. I tested with two drives
in RAID0, RAID1, and software (Windows) RAID0 (one physical drive per
logical drive), and four drives in RAID0, RAID10, and software RAID0. Also,
I tested multiple copies of HD Tune, one per drive, from 1 to 5 drives (i.e.
one copy running on logical drive 1, one running on logical drive 2, etc.).
The results were essentially that there were two limits: 25MB/sec per
logical drive and 40MB/sec overall. For example, the multiple-HDTune tests
gave the following results:
1 drive: 25MB/sec
2 drives: 19MB/sec per drive
3 drives: 13MB/sec per drive
4 drives: 10MB/sec per drive
5 drives: 8MB/sec per drive
In all cases the transfer rate line was flat with the heartbeat pattern, and
all tests except software RAID0 were done on unformatted/unpartitioned drives.
Software RAID0 was tested by a self-written program that wrote a huge random
file to the drive then read blocks of it back. Not as good as HD Tune but it
agreed with the results of multiple drives in JBOD mode.

I think we agree it is flat because it's not the drives that are the
bottleneck. It may be an incompatibility issue rather than a bum
board or a wrong bus speed setting. IMHO it's hard to tell the exact
cause just from the numbers.
Additionally, I tried setting up a RAID1+hotspare and taking one of the
drives offline by marking it as defunct. The rebuild took 15m48s, which
gives a rebuild rate of 18.5MB/sec (give or take) and an overall cable usage
of ~37MB/sec. This should be independent of any PCI or driver problems, and
done on the card, I would assume.

Yes, but the card expects to do this in the _background_, so you can't
always expect to use a rebuild to discern maximum performance. At
the very least, the time it takes changes with the priority you assign
it.
So, it looks like there's at least two problems. The first to look at is why
it seems to be bottlenecked at 40MB/sec overall. This looks a lot like the
bus is being forced into SE mode. Since I'm using 68->80 adapters, this was

Maybe, maybe not. There is SCSI command overhead, I think in the
neighborhood of 20%, so 37-40MB/sec seems high for an SE bus.
the first suspect even though they're supposedly U320 compatible. I tried a
single adapter and drive, and two adapters and drives, at various positions
along the cable. I repeated this with various combinations of the 5 adapters
I've got. Same results as before, so the adapters are either completely
broken or OK. The cable is also supposedly U320 capable. It's got a built-in
LVD/SE terminator made by AMP by the looks of it.

Do you have a U160 or better 68-pin drive, or alternatively an 80-pin
carrier? If the adapters are all the same model they are made to the same
specs, so a specific defective unit is less likely.
The main problem is that I don't have any spare SCSI bits except 80-pin hard
drives. Also, as far as I can tell there's no way to get any information
from the ServeRAID about per-drive limits or anything. Only an overall
"cable speed" which is being reported as U160 regardless of what I plug into
the cable. So it seems like I'm basically reduced to shot-in-the-dark sort
of troubleshooting, unless there's some diagnostic tool I don't know about
(very likely, as I've never dealt with ServeRAIDs before).

When the controller is set to the "Optimal" setting it is a little
confusing what it is actually doing. IIRC the command line tools can
be helpful here.
[...]
Spindle sync (which you may not be able to do) helped a lot for single
user/few user work.

I'm using 68->80 adapters so I can't do spindle sync even if the drives
support it (of which there is no mention on the Seagate site).

IBM was messing around with a master-independent spindle sync
methodology some time ago. If everything happens to be friendly you
can enable it simply by jumpering the sync pins, without a wire or RPL
mode page setting. I don't think it is possible on 15Ks; Seagate's
official stance is that RPL is obsolete and unavailable on their 15K
Cheetahs. It's not going to massively affect total throughput anyway;
i.e. it may help some but it won't solve your problem.
[...]

Many thanks for the help!

I think you're stuck with this HW combo. Since I probably haven't
helped much, I hope it's at least been entertaining.
 

Folkert Rienstra

Michael Brown said:
I found an interesting article about the IWill MPX2 and SCSI RAID cards
(after I'd bought the card, of course :| )
http://www.burningissues.net/hard/iwill/MPX2p4.htm
I get similar ATTO results, and I found a thread (but can't find it again)
on the 2CPU forums where another person had similar problems with a 4Mx on
another MPX board. It looks like the ServeRAID 4's just don't like the
chipset unfortunately.


I'm going to be trying it on another board (Soltek 75DRV5, Via KT333 based)
probably this weekend. I have an Intel board but taking it out of service
for a day to mess around with would not make me too popular ;)

I've done a few more low-level tests with HD Tune. I tested with two drives
in RAID0, RAID1, and software (Windows) RAID0 (one physical drive per
logical drive), and four drives in RAID0, RAID10, and software RAID0. Also,
I tested multiple copies of HD Tune, one per drive, from 1 to 5 drives (i.e.
one copy running on logical drive 1, one running on logical drive 2, etc.).
The results were essentially that there were two limits: 25MB/sec per
logical drive and 40MB/sec overall. For example, the multiple-HDTune tests
gave the following results:
1 drive: 25MB/sec
2 drives: 19MB/sec per drive
3 drives: 13MB/sec per drive
4 drives: 10MB/sec per drive
5 drives: 8MB/sec per drive
In all cases the transfer rate line was flat with the heartbeat pattern, and
all tests except software RAID0 were done on unformatted/unpartitioned drives.
Software RAID0 was tested by a self-written program that wrote a huge random
file to the drive then read blocks of it back. Not as good as HD Tune but it
agreed with the results of multiple drives in JBOD mode.

Additionally, I tried setting up a RAID1+hotspare and taking one of the
drives offline by marking it as defunct. The rebuild took 15m48s, which
gives a rebuild rate of 18.5MB/sec (give or take) and an overall cable usage
of ~37MB/sec. This should be independent of any PCI or driver problems, and
done on the card, I would assume.

So, it looks like there's at least two problems. The first to look at is why
it seems to be bottlenecked at 40MB/sec overall.
This looks a lot like the bus is being forced into SE mode.

Nope, you can't get 40MB/s out of a 20MHz bus; there's overhead too.
So if you get 40MB/s you are running LVD at over 20MHz, at a minimum.

And for a single drive on SE you should get at least into the 30-35MB/s
range with those X15 drives, so your problems are different.
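
Rough numbers, assuming the SE case would be Fast-20 (Ultra) Wide, and using
the ~20% overhead figure mentioned earlier in the thread:

    # SE tops out at Ultra (Fast-20) Wide: 20MHz x 2 bytes per transfer
    se_theoretical = 20 * 2                    # = 40 MB/s absolute bus ceiling

    # with command/arbitration overhead, sustained throughput sits well
    # under the ceiling
    se_realistic = se_theoretical * (1 - 0.2)  # ~32 MB/s

    # so an observed ~40 MB/s aggregate can't be an SE bus; the bus has to
    # be LVD running faster than 20MHz (U160's ceiling is 160 MB/s)
    print(se_theoretical, se_realistic)
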
Since I'm using 68->80 adapters, this was
the first suspect even though they're supposedly U320 compatible. I tried a
single adapter and drive, and two adapters and drives, at various positions
along the cable. I repeated this with various combinations of the 5 adapters
I've got. Same results as before, so the adapters are either completely
broken or OK. The cable is also supposedly U320 capable. It's got a built-in
LVD/SE terminator made by AMP by the looks of it.

The main problem is that I don't have any spare SCSI bits except 80-pin hard
drives. Also, as far as I can tell there's no way to get any information
from the ServeRAID about per-drive limits or anything. Only an overall
"cable speed" which is being reported as U160 regardless of what I plug into
the cable. So it seems like I'm basically reduced to shot-in-the-dark sort
of troubleshooting, unless there's some diagnostic tool I don't know about
(very likely, as I've never dealt with ServeRAIDs before).

[...]
Spindle sync (which you may not be able to do) helped a lot for single
user/few user work.

I'm using 68->80 adapters so I can't do spindle sync even if the drives
support it (of which there is no mention on the Seagate site).

Well, you can if the adapter supports it (has the pins).
[...]

Many thanks for the help!
 
