Poor RAID 1 performance?



Gerhard Fiedler

Rod said:
Dangerous business.

I know, but gotta bite the bullet once in a while :)

Thanks for sharing your ideas. I won't switch methods immediately --
especially since my various RAID1 setups are already set and work fine --,
but much of what you say makes sense and will find its way into future
considerations.

Gerhard
 

Gerhard Fiedler

Peter said:
Actually, you might be right to some extent.

I have to admit I didn't fully understand Antoine's answer; I don't know
enough about how the data is actually stored on disk. Need to look a bit
more into that probably to understand the issues here.
If I imagine a huge file (let's say 100GB) in a RAID1 and have a gigantic
cache (let's say 50GB), then the controller might read the first half of the
file at the same time as it caches the second half from the other disk. And
voila, a whole 100GB read in just half of the time needed to read it from a
single disk.

It wouldn't need a gigantic cache: it can pump the data out to the IDE bus as
it comes in. Since the IDE bus is faster than the disks, it can read, say,
the 1st track on disk A and at the same time the 2nd track on disk B. While
it does that, it sends the data from disk A out the bus. While it starts
reading the 3rd track on disk A and the 4th track on disk B, it dumps all
the 2nd track data (that was in the cache) through the bus, and catches up
with 3rd track data when the 2nd track is all out. And so on. Sounds like
it could be almost twice as fast as a single disk... ideally, of course,
with large, sequentially stored files.
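The pipelined read described above can be put into a toy timing model. Every number here (track size, disk rate, file size) is an illustrative assumption, not a figure from the thread; the point is only that when each mirror reads every other track and the bus can drain data at least twice as fast as one disk delivers it, the elapsed time approaches half of a single-disk read:

```python
# Toy timing model for a mirrored pair streaming alternate tracks.
# Assumptions (not from the thread): each track holds 1 MB, a disk
# reads at 50 MB/s, and the bus is at least twice as fast as one
# disk, so the bus is never the bottleneck.

TRACK_MB = 1.0
DISK_MBPS = 50.0
TRACKS = 100  # a 100 MB sequential file

def single_disk_seconds(tracks):
    """One disk reads every track in order."""
    return tracks * TRACK_MB / DISK_MBPS

def mirrored_pair_seconds(tracks):
    """Disk A reads the odd tracks, disk B the even tracks, in
    parallel; each disk therefore reads only half the tracks."""
    per_disk = (tracks + 1) // 2
    return per_disk * TRACK_MB / DISK_MBPS

print(single_disk_seconds(TRACKS))    # 2.0 seconds
print(mirrored_pair_seconds(TRACKS))  # 1.0 seconds -- close to a 2x speedup
```

The model ignores seek and rotational delays, which is exactly the "ideally, with large, sequentially stored files" caveat above.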

Gerhard
 

Gerhard Fiedler

Peter said:
Read up on TwinStor (3ware). They have something called RAID1+0 on just two
disks. Is that mirroring? I don't know. Will a single drive still work
(reattached to a different controller) when the 3ware controller dies? No idea.

http://japan.3ware.com/products/pdf/Twinstor.pdf

This sounds exactly like what I was talking about. It also sounds as if
each mirror drive would be able to run as a single drive (just as any other
RAID1); the only difference between what they do and "normal" RAID1 seems
to be that they optimize the placement of the data, considering that they
can read from both drives, and then use both drives concurrently when
reading data.

Gerhard
 

Rod Speed

Gerhard Fiedler said:
Peter wrote
I have to admit I didn't fully understand Antoine's answer; I don't
know enough about how the data is actually stored on disk. Need
to look a bit more into that probably to understand the issues here.

The problem has nothing to do with how the data is stored on the
disks; the problem is that the RAID doesn't get requests for the
data in a way that allows it to decide that it makes sense to get the
data from both drives simultaneously with long sequential reads.
 

Peter

Actually, you might be right to some extent.
The problem has nothing to do with how the data is stored on the
disks; the problem is that the RAID doesn't get requests for the
data in a way that allows it to decide that it makes sense to get the
data from both drives simultaneously with long sequential reads.

Look at the RAID1 section:
http://www.xbitlabs.com/articles/storage/display/3ware-7850.html
or:
http://www.storagereview.com/articles/200107/200107037410_2.html
or page 38 here:
http://research.microsoft.com/BARC/Sequential_IO/Win2K_IO_MSTR_2000_55.pdf

Benchmarks show some substantial improvement over a single drive.
 

Rod Speed

Peter said:

Irrelevant to the point I was making.
Benchmarks show some substantial improvement over a single drive.

But not with LONG SEQUENTIAL READS getting the
twice-the-performance that is theoretically possible if
the RAID system can work out that it's going to be
getting a request for that long sequential read.

It isn't even viable to read ahead in the hope that
the as-yet-unrequested stuff will be requested later,
basically because there is nowhere to put that read-ahead.
 

Folkert Rienstra

Antoine Leca said:
In news:[email protected], Gerhard Fiedler wrote:
Of course there are no RAID0 stripes in RAID1. But for reading, a
stripe is nothing more than a subset of data.

The difference between RAID 0 and RAID 1 is the distance between one stripe
and the next on disk. With RAID 0, they are contiguous for both reading and
writing. With RAID 1, you cannot make them contiguous or near enough for
both writing [every sector] and reading [every two sectors].

So that's obviously not the way to do it.
So the obvious organisation is to make it OK for writing (so, just like a
normal disk), and while reading sequentially you have to wait while
skipping the odd-numbered sectors in between.

That's silly.
You divide a single request into two and send each half to a different drive.
With consecutive requests (for a sequential file) you send half the number of
total requests to one drive and the other half to the other one, resulting in
both drives reading sequentially.
There is a gain though (half traffic on the wire, for example),

That's not a perceptible gain.
 

Folkert Rienstra

Peter said:

On the full-drive sequential benchmark level, not on the single sequential file level.
On the single sequential file level it should be able to read in parallel and achieve
RAID0 read speeds for the single file. It does nothing for copying a full drive
or several files in sequence, though.
 

Folkert Rienstra

Rod Speed said:
Irrelevant to the point I was making.

Exactly. (Since you didn't make any).
But not with LONG SEQUENTIAL READS getting the
twice-the-performance that is theoretically possible if
the RAID system can work out that it's going to be
getting a request for that long sequential read.

There is no such thing as a single 'request for that long sequential read'.
It's several (unless you consider 128kB to be long).
It isn't even viable to read ahead in the hope that
the as-yet-unrequested stuff will be requested later,
basically because there is nowhere to put that read-ahead.

Completely missed the point. Nothing to do with cache.
Everything to do with spreading requests in such a way
that two sequential reads are performed in parallel.
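The point above, that the win comes from how the controller distributes whole read commands rather than from caching, can be sketched as follows. The request list and sizes are hypothetical; a real controller's scheduling is more involved:

```python
# Hypothetical sketch of spreading a stream of consecutive read
# requests across a mirrored pair, so each drive ends up serving
# half of the total work. No cache is involved: every request is an
# ordinary disk command, just routed to one of the two copies.

def split_requests(requests):
    """Alternate consecutive requests between mirror A and mirror B."""
    drive_a, drive_b = [], []
    for i, req in enumerate(requests):
        (drive_a if i % 2 == 0 else drive_b).append(req)
    return drive_a, drive_b

# Eight consecutive 128 KB reads (256 sectors each) of one file,
# expressed as (start LBA, sector count) pairs:
reqs = [(lba, 256) for lba in range(0, 8 * 256, 256)]
a, b = split_requests(reqs)
print(a)  # drive A gets requests 0, 2, 4, 6
print(b)  # drive B gets requests 1, 3, 5, 7
```

Both drives then work in parallel on their half of the stream, which is the "two sequential reads performed in parallel" effect described above.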
 

Rod Speed

Some ****wit pseudokraut that has run out of its meds,
usual pathetic excuse for a troll that's all it can ever manage.
 

Antoine Leca

Folkert Rienstra wrote:
Antoine Leca said:
With RAID 1, you cannot make them contiguous or near enough for
both writing [every sector] and reading [every two sectors].

So that's obviously not the way to do it.

Sorry, you are too terse for me. Which "way" are you writing about?

That's silly.

May I ask you how it should be done then?

You divide a single request into two and send each half to a
different drive. With consecutive requests (for a sequential file)
you send half the number of total requests to one drive and the
other half to the other one, resulting in both drives reading
sequentially.

Sure, sorry, I had assumed this explanation.
And since the media have been written sequentially (I assume it, but I'd be
happy to learn a better way), if you are reading all the even-numbered
(resp. odd-numbered) sectors, then the disk controller has to wait for the
media a small bit between each sector to be transferred, which should give a
small penalty with respect to a non-RAID configuration (assuming media
access is the bottleneck, that is).

That's not a perceptible gain.

/Quite/ possible ;-).


Antoine
 

Folkert Rienstra

Antoine Leca said:
Folkert Rienstra wrote:
"Antoine Leca" (e-mail address removed) wrote in message news:[email protected]
With RAID 1, you cannot make them contiguous or near enough for
both writing [every sector] and reading [every two sectors].

So that's obviously not the way to do it.

Sorry, you are too terse for me. Which "way" are you writing about?

You ask me to explain your own writing?
May I ask you how it should be done then?

No. You can read the paragraph below instead.
Sure, sorry I had assumed this explanation.

Uh, what?
And since the media have been written sequentially (I assume it, but I'd be
happy to learn a better way), if you are reading all the even-numbered
(resp. odd-numbered) sectors, then the disk controller has to wait for the
media a small bit between each sector to be transferred,

You need to get rid of that even/odd numbered sectors fascination and start
thinking in disk IO-commands, the way I described in the previous paragraph.
which should give a small penalty

Nope, that's a big penalty.
Reading half the sectors per time unit obviously gets you half the
throughput. Combined with the other drive you will get your 100% back.
 

Mike Tomlinson

Arno Wagner said:
Very true!

Unless you have a crappy RAID controller that mirrors a failing drive
onto its working neighbour, losing all the data. Which has happened to
me. Tape backup saved my skin on that occasion.
 

Gerhard Fiedler

Mike said:
Unless you have a crappy RAID controller that mirrors a failing drive
onto its working neighbour, losing all the data. Which has happened to
me. Tape backup saved my skin on that occasion.

Don't they send the write commands to two drives rather than mirror one
drive to the other?

Gerhard
 

Arno Wagner

Previously Mike Tomlinson said:
Unless you have a crappy RAID controller that mirrors a failing drive
onto its working neighbour, losing all the data. Which has happened to
me.

Ouch! That hurts! Pretty incompetent design. Care to give a brand
and model?
Tape backup saved my skin on that occasion.

Well, nobody competent thinks RAID is a replacement for backup.
It will just make the likelihood of the backup being needed
significantly smaller and saves time that way. Of course
backup is still non-optional for all important data.

Arno
 

Antoine Leca

Folkert Rienstra wrote:
Antoine Leca said:
Folkert Rienstra wrote:
"Antoine Leca" (e-mail address removed) wrote in message
news:[email protected]
With RAID 1, you cannot make them contiguous or near enough for
both writing [every sector] and reading [every two sectors].

So that's obviously not the way to do it.

Sorry, you are too terse for me. Which "way" are you writing about?

You ask me to explain your own writing?

I am sorry it appears I also was too terse...
No, I believe I understand what I wanted to say ;-) (another thing is the
meaning of what I actually wrote :), but it was not the purpose of my
question above).

I was asking, since you wrote my sentence (describing just an alternative)
was "obviously not the way to do it", what should be "the way to do [RAID 1
data organization on disk]."

[...]
You need to get rid of that even/odd numbered sectors fascination and
start thinking in disk IO-commands, the way I described in the previous
paragraph.

Disk commands are interesting to understand the pros and cons of an
interface implementation (and I agree I missed something here on the first
shot; I'd like to thank here everyone who provided me detailed explanations
and commentaries, by the way.)


However, only considering the interface will drive us into a perfect world
where the disk provides the data to the controller and up without any
delay; and we all know this perfect world is not yet the one we live in.

So I was _also_ considering the physical organization of the sectors on the
media. And my idea is that the write operation on the RAID controller will
not do anything special (in other words, the RAID controller will issue the
same I/O command to both "disks" as what it received from the upper level,
including same sector numbers); while the read operation on the RAID
controller will be split as you explained (thanks.)

Now, when you look at a lower level ("disk" within quotes above), my idea is
that the write operation accesses the media without lost time for a
sequential run of sectors; this in turn means that the read operation
(which, as seen by the disk, only concerns the even- resp. odd-numbered
sectors, as passed inside the I/O commands received) cannot do the same.

And my (perhaps faulty) conclusion was that this represented a small penalty
(with respect to e.g. RAID 0, where the traffic of I/O commands while
reading sequentially is essentially the same), because the read operations
have to skip some sectors while accessing the media.


Antoine
 

Folkert Rienstra

Antoine Leca said:
Folkert Rienstra wrote:
"Antoine Leca" (e-mail address removed) wrote in message news:43940b8b$0$6485$[email protected]
Folkert Rienstra wrote:
"Antoine Leca" (e-mail address removed) wrote in message
With RAID 1, you cannot make them contiguous or near enough for
both writing [every sector] and reading [every two sectors].

So that's obviously not the way to do it.

Sorry, you are too terse for me. Which "way" are you writing about?

You ask me to explain your own writing?

I am sorry it appears I also was too terse...
No, I believe I understand what I wanted to say ;-) (another thing is the
meaning of what I actually wrote :), but it was not the purpose of my
question above).

I was asking, since you wrote my sentence (describing just an alternative)
was "obviously not the way to do it", what should be
"the way to do [RAID 1 data organization on disk]."

And my answer was to leave that alone and to stop thinking about alternate
sector reading and instead go by dividing single sequential transfers into
2 sequential transfers of half size that are executed in parallel.
You divide a single request into two and send each half to a
different drive. With consecutive requests (for a sequential file)
you send half the number of total requests to one drive and the
other half to the other one, resulting in both drives reading
sequentially.
[...]
You need to get rid of that even/odd numbered sectors fascination and
start thinking in disk IO-commands, the way I described in the previous
paragraph.

Disk commands are interesting to understand the pros and cons of an
interface implementation (and I agree I missed something here on the first shot;

I'm still not convinced that you got it now.
I'd like to thank here everyone who provided me detailed explanations and
commentaries, by the way.)


However, only considering the interface will drive us into a perfect world
where the disk provides the data to the controller and up without any delay;

It does, as far as comparing RAID vs non-RAID goes, leaving access time out of
the equation.
and we all know this perfect world is not yet the one we live into.

As long as an IO is issued before its start sector passes the head there will
be no delays.
However, reading only even or odd sectors has only 50% effectiveness,
as described in the part that you snipped.
So I was _also_ considering the physical organization of the sectors on the
media. And my idea is that the write operation on the RAID controller will
not do anything special (in other words, the RAID controller will issue the
same I/O command to both "disks" as what it received from the upper level,
including same sector numbers); while the read operation on the RAID
controller will be split as you explained (thanks.)

Which has nothing to do with a changed on-disk organization.
Now, when you look at a lower level ("disk" within quotes above), my idea
is that the write operation accesses the media without lost time for a
sequential run of sectors ;
this in turn means

No, it doesn't.
that the read operation (which, as seen by the disk, only concerns the even-
resp. odd-numbered sectors, as passed inside the I/O commands received).

Repeat: That is very ineffective.
Not only 1 sector per IO (high overhead on command vs data transferred, slow)
but also reading at half the normal data rate.
And my (perhaps faulty) conclusion was that this represented a small penalty
(with respect to e.g. RAID 0,

No, you said "non-Raid".
where the traffic of I/O commands while reading sequentially is essentially the same),

Go back and read what I said about that.
because the read operations have to skip some sectors while accessing the media.

That reminds me of Danyel Goodwin, who once maintained that drives could read
slightly faster than they could write because they could skip the head-settling
field between header and data while they were reading.
As if the drive could jump ahead in time, skipping that field.

Sectors pass as slow or as fast as normal, whether you read them or not.
They take their usual time to pass. Read half of them, get half the throughput.
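The "half the sectors, half the throughput" argument above is plain arithmetic, and can be made explicit. The media rate below is an illustrative number, not a figure from the thread:

```python
# Sectors pass under the head at the media rate whether they are read
# or skipped, so a drive that reads only every other sector delivers
# half its sequential throughput. Two drives doing this in parallel
# merely add back up to single-drive speed -- no net gain.

MEDIA_MBPS = 60.0  # illustrative sustained media rate of one drive

every_sector = MEDIA_MBPS           # normal sequential read
alternate_sectors = MEDIA_MBPS / 2  # skipping every other sector
pair_combined = 2 * alternate_sectors

print(every_sector)   # 60.0 MB/s from one drive reading everything
print(pair_combined)  # 60.0 MB/s combined from the alternating pair
```

This is why the per-request splitting scheme (whole commands alternated between mirrors) beats any per-sector interleaving on mirrored data.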
 

Antoine Leca

To make a long story short:

Folkert Rienstra wrote:
No, you said "non-Raid".

Okay, I found the origin of the misunderstanding.

I was specifically referring to RAID variations:
] The difference between RAID 0 and RAID 1 is the distance between one
] stripe and the next on disk.

I made a little mistake while replying: I wrote
] [...] which should give a small penalty with respect to a non-RAID
] configuration (assuming media access is the bottleneck, that is).
I meant "non-RAID1" but did not make it clear. It should have been
``contiguous reading'' (or plain ``RAID-0'' if you prefer); that would have
been much clearer.
I am sorry for the mistake.

OTOH, I basically assumed a 1-sector stripe (which is not real.)
Or that the stripe size is lower than available free cache on the RAID
controller (and we hope so.)

Read half of them, get half the throughput.

My point, exactly.
Since there are two drives, we end up at twice this throughput when data are
delivered to the main system by the RAID controller, and cannot do better
while reading sequentially.

If you compare to a non-RAID setup, RAID-1 might result in a better
throughput, for example if the connection link between the controller and
the drive is a limiting factor (ATA33 IIRC).

However, in the common case where the limiting factor is media access,
RAID-1 will result in about the same figures as non-RAID when seen from
outside (as you explained long and wide, thanks.)

OTOH, RAID-0 may deliver a higher throughput, because it can use more of
the available bandwidth on the connection link.
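The summary above condenses to a bottleneck model: delivered throughput is the minimum of the aggregate media rate and the host link rate. The figures below (an ATA/33-class link, a slower per-drive media rate) are illustrative assumptions for the common, media-limited case:

```python
# Toy bottleneck model: what the host sees is min(total media rate,
# link rate). For long sequential reads, RAID1 delivers one media
# stream's worth of useful data, while RAID0 aggregates two streams.
# All rates are illustrative assumptions.

LINK_MBPS = 33.0   # e.g. an ATA/33-class host link
MEDIA_MBPS = 25.0  # one drive's sustained media rate

def throughput(media_streams, per_stream_mbps, link_mbps):
    """Effective rate: media streams add up until the link saturates."""
    return min(media_streams * per_stream_mbps, link_mbps)

single = throughput(1, MEDIA_MBPS, LINK_MBPS)  # 25.0, media-limited
raid1 = throughput(1, MEDIA_MBPS, LINK_MBPS)   # same as single for sequential reads
raid0 = throughput(2, MEDIA_MBPS, LINK_MBPS)   # 33.0, link-limited
print(single, raid1, raid0)
```

With these numbers RAID-0 gains only up to the link ceiling, matching the observation that it "can use more of the available bandwidth on the connection link" while RAID-1 tracks the single-drive figure.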


Antoine
 
