Optimal Buffer Size To Read A File


Amy L.

Is there a buffer size that is optimal based on the framework or OS when
working with chunks of data? Also, is there a point where the buffer size
becomes too large to be optimal? I am considering an 8K or 16K buffer. The
file sizes are random but range between 8K and 100K, with the occasional
file being several megs.

Example:

int _READBUFFER_ = 1024 ;
fi = new FileInfo( args[ 0 ] ) ;
FileStream fs = fi.OpenRead() ;
while ( fs.Read( ByteArray, 0, _READBUFFER_ ) > 0 )
{
myStringBuilder.Append( textConverter.GetString( ByteArray ) ) ;
}

Thanks,
Amy
 

Jon Skeet [C# MVP]

Amy L. said:
Is there a buffer size that is optimal based on the framework or OS when
working with chunks of data? Also, is there a point where the buffer size
becomes too large to be optimal? I am considering an 8K or 16K buffer. The
file sizes are random but range between 8K and 100K, with the occasional
file being several megs.

Example:

int _READBUFFER_ = 1024 ;
fi = new FileInfo( args[ 0 ] ) ;
FileStream fs = fi.OpenRead() ;
while ( fs.Read( ByteArray, 0, _READBUFFER_ ) > 0 )
{
myStringBuilder.Append( textConverter.GetString( ByteArray ) ) ;
}

"Optimal" depends on where you draw the line between time and memory.
The larger the buffer, the faster - but the more expensive in terms of
memory.

Note that your code above is broken, as it doesn't use the value
returned by Read to make sure that only that amount of data is
converted by the text converter.
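A corrected version of the loop might look like the sketch below. It keeps Amy's names (`myStringBuilder`, `textConverter`), which are assumed here to be a `StringBuilder` and an `Encoding`, and passes the byte count returned by `Read` on to `GetString` so the final, partial chunk is converted correctly:

```csharp
using System;
using System.IO;
using System.Text;

class ReadFileExample
{
    public static string ReadAll(string path)
    {
        const int ReadBuffer = 16 * 1024;          // 16K buffer
        byte[] byteArray = new byte[ReadBuffer];
        StringBuilder myStringBuilder = new StringBuilder();
        Encoding textConverter = Encoding.ASCII;   // assumption: single-byte encoding

        FileInfo fi = new FileInfo(path);
        using (FileStream fs = fi.OpenRead())
        {
            int bytesRead;
            // Use the value returned by Read so that only the bytes
            // actually read in this chunk are converted to text.
            while ((bytesRead = fs.Read(byteArray, 0, ReadBuffer)) > 0)
            {
                myStringBuilder.Append(
                    textConverter.GetString(byteArray, 0, bytesRead));
            }
        }
        return myStringBuilder.ToString();
    }

    static void Main(string[] args)
    {
        Console.WriteLine(ReadAll(args[0]));
    }
}
```

Note that for a multi-byte encoding such as UTF-8, a character can still straddle a chunk boundary; wrapping the stream in a `StreamReader` lets the framework carry that decoder state across reads for you.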
 

William Stacey [MVP]

Everything else being equal, reading in multiples of your NTFS allocation
unit size (i.e. cluster size) is probably the most efficient. For a 4K
allocation size, for example, your file size will be a multiple of 4K, with
the minimum being 4K even for a 1-byte file. So for a 5-unit file (5 x 4K =
20K), posting a 20K read would probably be best from a strictly HD-read
perspective, as the driver could potentially post all 5 reads in a row and
read the whole file in one swath. In the real world, things are not that
simple. You have other stuff going on contending for reads/writes, driver
optimizations, hw optimizations (e.g. RAID), and the list is long. If you're
reading a file and writing it to disk or the network, I tend to favor 4K if
your allocation size is 4K, or a slightly larger multiple, so 8K should be
fine as well. Testing perf for your specific app could help find the right
size. Cheers.
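If you would rather size the buffer from the actual allocation unit than guess, one way (a Windows-only sketch using P/Invoke to the Win32 GetDiskFreeSpace API; the 4096 fallback is an assumption, not something the API guarantees) is:

```csharp
using System;
using System.Runtime.InteropServices;

class ClusterSize
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool GetDiskFreeSpace(
        string lpRootPathName,
        out uint lpSectorsPerCluster,
        out uint lpBytesPerSector,
        out uint lpNumberOfFreeClusters,
        out uint lpTotalNumberOfClusters);

    // Returns the allocation unit (cluster) size for the given drive
    // root, e.g. "C:\"; falls back to an assumed 4K when the Win32
    // call is unavailable or fails (e.g. on non-Windows platforms).
    public static uint GetClusterSize(string root)
    {
        try
        {
            uint sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;
            if (GetDiskFreeSpace(root, out sectorsPerCluster, out bytesPerSector,
                                 out freeClusters, out totalClusters))
            {
                return sectorsPerCluster * bytesPerSector;
            }
        }
        catch (DllNotFoundException) { }
        return 4096; // assumption: common default NTFS cluster size
    }

    static void Main()
    {
        uint clusterSize = GetClusterSize("C:\\");
        // Per the advice above, use a small multiple of the cluster
        // size as the read buffer (here, 2 clusters).
        int readBuffer = (int)(clusterSize * 2);
        Console.WriteLine("Cluster size: " + clusterSize
                          + ", buffer: " + readBuffer);
    }
}
```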
 

Amy L.

Jon,

Thank you for the follow-up, and for pointing out the issue with not using
the return value of Read in the text conversion.

Amy.

Jon Skeet said:
Amy L. said:
Is there a buffer size that is optimal based on the framework or OS when
working with chunks of data? Also, is there a point where the buffer size
becomes too large to be optimal? I am considering an 8K or 16K buffer. The
file sizes are random but range between 8K and 100K, with the occasional
file being several megs.

Example:

int _READBUFFER_ = 1024 ;
fi = new FileInfo( args[ 0 ] ) ;
FileStream fs = fi.OpenRead() ;
while ( fs.Read( ByteArray, 0, _READBUFFER_ ) > 0 )
{
myStringBuilder.Append( textConverter.GetString( ByteArray ) ) ;
}

"Optimal" depends on where you draw the line between time and memory.
The larger the buffer, the faster - but the more expensive in terms of
memory.

Note that your code above is broken, as it doesn't use the value
returned by Read to make sure that only that amount of data is
converted by the text converter.
 
