File reading fast (180 MB/sec), file writing slow (4 MB/sec)?

Skybuck Flying

Hi,

I made two simple harddisk benchmarks.

One for reading data from a file and one for writing data to a file.

I just use Delphi's TFileStream, which in turn uses the Win32 API.

The harddisk benchmark shows that my harddisk is able to read 180 MB/sec
from a file which is 36 MB.

The other harddisk benchmark shows that my harddisk is able to write 4
MB/sec to the same file which is 36 MB.
( overwriting it ).

So here is a huge difference in speeds. Which leads me to wonder what is
causing this huge difference.

The access is done sequentially in both tests... (reading and writing).

I am guessing that the writing is slow because it has to change the NTFS
file allocation table? So then the harddisk write head has to move
constantly... but why is the reading so fast? Maybe the reading does not
have to constantly move its head... each cluster could have a next pointer
to where the next data is to be read... that could explain it...

Of course I am just guessing here...

Anybody know?

Bye,
Skybuck.
 
Bob Willard

180 MB/s vastly exceeds the capability of any single HD. That kind
of read speed can only come from re-reading data that is already in
the OS cache.

If you want a real HD benchmark, I suggest HDtach. Most so-called
disk benchmarks are really filesystem benchmarks, with all the
misleading results such as you reported.
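
The cache effect described above can be made visible with a rough sketch like the following. It is written in Python rather than Delphi, and the function names `write_mb_per_sec` and `read_mb_per_sec` are made up for this illustration. It assumes that `os.fsync` makes the OS push cached data to the device, which is what usually separates a "write into the cache" figure from a "write to the disk" figure. The numbers depend entirely on the machine, so none are shown.

```python
import os
import tempfile
import time


def write_mb_per_sec(path, size_mb, chunk_kb=64, sync=True):
    """Write size_mb of zero bytes sequentially and return the throughput in MB/s."""
    chunk = b"\0" * (chunk_kb * 1024)
    chunks = size_mb * 1024 // chunk_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(chunk)
        if sync:
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push cached data to the device
    elapsed = time.perf_counter() - start
    return size_mb / max(elapsed, 1e-9)


def read_mb_per_sec(path, chunk_kb=64):
    """Read the whole file sequentially and return the throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_kb * 1024)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.bin")
        print("write, flushed to device:", round(write_mb_per_sec(path, 36), 1), "MB/s")
        print("write, left in OS cache: ", round(write_mb_per_sec(path, 36, sync=False), 1), "MB/s")
        print("read, just-written file: ", round(read_mb_per_sec(path), 1), "MB/s")
```

The read in the last line hits a file that was written a moment earlier, so it is most likely served from memory, which is the situation behind the 180 MB/sec figure. A file much larger than RAM, or a tool that bypasses the filesystem, is needed to measure the disk itself.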
 
Skybuck Flying

Indeed, I tested again, this time on a larger file... the read speed was only
4 MB/sec.

Another person tested the same file and got 40 to 80 MB/sec.

It probably depends on the file being fragmented or not :D
 
Bob Willard

Skybuck said:
Actually I am not really interested in the harddisk speed itself.

I am interested in how fast it will read a file.

So that includes the file system.

Also how stable the transfer speed is.

Windows has two ways of doing I/O: synchronous and asynchronous (overlapped
I/O etc.)... maybe that makes a difference too?

Sure should. Asynchronous I/O offers the opportunity for much faster
I/O, but the app must be coded to take advantage of it by issuing
multiple, overlapped, I/O requests. How much gain you get depends --
on everything. SCSI HDs all support asynchronous I/O; a few IDE HDs
do, but most do not; I don't know about the current crop of SATA HDs.
If you have multiple HDs then you have the opportunity for serious
gains with async. I/O, but it is frequently rather difficult to
balance the app well enough to get a gain of ~N with N HDs.
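
As a rough illustration of "issuing multiple, overlapped, I/O requests", here is a minimal Python sketch. It does not use the Win32 overlapped API. It assumes that a small thread pool, with each request opening its own file handle, is an acceptable stand-in for keeping several reads outstanding at once. The names `read_block` and `read_overlapped` are made up for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_block(path, offset, size):
    """Read one block with a private file handle so requests do not share a file position."""
    with open(path, "rb") as f:
        f.seek(offset)
        return offset, f.read(size)


def read_overlapped(path, block_size=256 * 1024, in_flight=4):
    """Keep up to in_flight block reads outstanding and reassemble them in offset order."""
    size = os.path.getsize(path)
    offsets = range(0, size, block_size)
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        futures = [pool.submit(read_block, path, off, block_size) for off in offsets]
        blocks = dict(f.result() for f in futures)
    return b"".join(blocks[off] for off in offsets)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        payload = os.urandom(1024 * 1024)
        with open(path, "wb") as f:
            f.write(payload)
        # The reassembled data should match what was written.
        print(read_overlapped(path) == payload)
```

Whether this is any faster than one plain sequential read depends on the drive, the driver and the cache, so it only shows the shape of the code, not a guaranteed gain.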
 
Skybuck Flying

Can you explain why overlapped I/O would be faster than synchronous I/O?

I can imagine the OS being smart and 'detecting' that multiple requests are
close by and therefore buffering more.

Or I can imagine the OS starting to behave more out of order... when the
harddisk head moves over the requested block you will get it... not
sooner.... so the OS moves the harddisk head as little as possible... and
when it's time you'll get it :D

Of course this is just me guessing why it could be faster...
 
Bob Willard

Best case, if the entire path between your app and your SCSI HD is
async., will be that your app issues a bunch of I/O disk R/W commands
which the OS will hand off to the HD. A SCSI HD will (usually)
choose the commands to execute in a seek-optimized order, rather
than in order of issuance, thus minimizing time spent seeking.

Or, if your PC has multiple HDs then the OS may pass one or more
commands to each HD, so that they seek and (in some cases) transfer
data in parallel.

Even if you only have a single dumb (most are) IDE HD, then async. I/O
overlaps the issuance of commands from your app to the OS and the
issuance of responses from the OS to your app.

Async I/O may be a recent addition to WinWhatever, but the technique
has been used in other OSs for decades.
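
The seek-optimized ordering mentioned above can be sketched with a toy model. This is not how any particular drive firmware works. It assumes the head position and the requests are plain block numbers and that seek cost is simply the distance travelled. The request list and the helper names are made up for the example.

```python
def total_seek_distance(start, positions):
    """Sum of head movements when positions are visited in the given order."""
    distance = 0
    head = start
    for pos in positions:
        distance += abs(pos - head)
        head = pos
    return distance


def elevator_order(start, positions):
    """Serve requests at or above the head in ascending order, then the rest in descending order."""
    ahead = sorted(p for p in positions if p >= start)
    behind = sorted((p for p in positions if p < start), reverse=True)
    return ahead + behind


requests = [98, 183, 37, 122, 14, 124, 65, 67]  # made-up block positions
head = 53
print(total_seek_distance(head, requests))                        # 640, order of issuance
print(total_seek_distance(head, elevator_order(head, requests)))  # 299, reordered
```

The drive can only reorder what it has been given, which is why the app has to have several requests outstanding at the same time.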
 
Skybuck Flying

Hmmm... maybe Borland will someday expand the TFileStream classes to allow
async I/O as well :D
 
Bob Willard

Skybuck said:
I also wonder how much slower or faster overlapped I/O might be.

I think not much?

I do think it might save processor time though.

And the answer is ... it depends. There are zillions of different
combinations of OS, CPU, HD(s) (including with and without the
various flavors of RAID implemented differently by different RAID
vendors), HD-memory paths (SATA, PATA, SCSI, net-mapped, etc.),
pagefile params, cache params, <blah, blah, blah>. The value of
async. I/O varies a lot, so you might want to pick the cases that
matter to you and measure the difference.
 
Skybuck Flying

Yep, that's exactly what I will do... measure the difference for my case.

I don't have much experience yet with writing software that uses async
I/O methods.

My idea at this point in time is simple, I make a thread like this:

// create 1000 buffers

// create a free buffer collection (whatever you wanna call it)

// create a filled buffer collection

// put the 1000 buffers into free buffer collection.

// open file etc

// thread loop start

// supply all free buffers to the async i/o api.

// sleepex(0), set the thread to an alertable state, and use all time.

// ( filled buffers would be processed here )

// put the buffer back into the free buffers

// thread loop end

// close file etc

// destroy buffers etc.

// destroy collections etc.

---

i/o completion routine:

// start

// put the filled buffer into the filled buffer collection etc.

// end

Ok... that's the basic idea... it's a lot more complex than just a simple
file stream... I do believe it might be worth it.
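
The free-buffer / filled-buffer idea can be sketched in Python as follows. This is only an approximation of the outline above. It assumes a background reader thread in place of the Win32 I/O completion routine, and blocking queues in place of sleepex, so it shows the buffer recycling pattern and not the overlapped API itself. The names `reader` and `process_stream` are made up for the example.

```python
import io
import queue
import threading


def reader(stream, free, filled):
    """Fill free buffers from the stream and hand them over; None marks end of data."""
    while True:
        buf = free.get()              # blocks until a buffer is free
        n = stream.readinto(buf)
        if not n:
            filled.put(None)
            return
        filled.put((buf, n))


def process_stream(stream, handle, buffer_count=8, buffer_size=64 * 1024):
    """Run the reader in a background thread and call handle(chunk) for each filled buffer."""
    free = queue.Queue()
    filled = queue.Queue()
    for _ in range(buffer_count):
        free.put(bytearray(buffer_size))
    worker = threading.Thread(target=reader, args=(stream, free, filled), daemon=True)
    worker.start()
    while True:
        item = filled.get()           # waits here until a filled buffer arrives
        if item is None:
            break
        buf, n = item
        handle(bytes(buf[:n]))        # process the filled part of the buffer
        free.put(buf)                 # put the buffer back into the free buffers
    worker.join()


if __name__ == "__main__":
    data = bytes(range(256)) * 1024
    chunks = []
    process_stream(io.BytesIO(data), chunks.append, buffer_size=4096)
    print(b"".join(chunks) == data)   # True
```

Because the processing loop blocks on the queue, there is no polling interval to pick, which sidesteps the sleep-time question in this simplified version.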

The only thing I am not sure about at this point is the sleepex(0)...

Should I set it to zero... or another value?

If I set it to another value I think the thread will only process one buffer
and then give control to other threads and that might be bad for
performance.

Cheers,
Skybuck :D
 
Skybuck Flying

Actually I forgot to mention why I think it would be worth it.

I think it would be worth it because it would use less cpu.

I believe that with sync the CPU is waiting for the HD to finish its task
and that is very bad for the rest of the system....

The whole system blocks...

With async... the CPU should be able to just continue to do other tasks...

That's why I think it would be worth it.
 
Bob Willard

Skybuck said:
Actually now I believe it's the other way around, hehe.

"A thread can relinquish the remainder of its time slice by calling this
function with a sleep time of zero milliseconds."

Unless I read something somewhere else specifying different behaviour.

Relinquish means to abandon or give up... in this case its time slice.

So sleepex(1) or just sleepex(10) or even sleepex(1000) might be fine too.

sleepex will be interrupted anyway when something happens :D

sleepex(1) will probably take a lot more cpu...
sleepex(1000) would probably be best... however then it might take 1 second
to stop the thread... which is not too bad.

sleepex(0) will probably cause too many context switches etc... causing bad
performance.

Sorry, but I can't help you with sleepex. And, this thread is getting
into more detail than seems appropriate in a NG.

So, 10-4 and good luck.
 
Skybuck Flying

Well..

I just finished testing some async i/o...

The only noticeable difference on Windows XP is when the thread stops... it
stops faster...

One more thing which is funny... Windows 98 doesn't support overlapped file
I/O at all..

So this will probably end my async i/o adventure :D
 
