I/O buffering

beginwithl

hi

1)
From MSDN site:
“FileStream buffers input and output for better performance.”

a) So essentially, FileStream is a buffered class?! But if that is the
case, then why do bytes get written to the underlying file
immediately? In other words, even if we never call Flush() or Close(),
bytes do get written to the file?

b) Does FileStream also buffer input bytes? If so, then what exactly
does that mean? That bytes may not get immediately read into, say, a
byte[] variable?

c) Does Flush() also work when reading from a file? Meaning it somehow
forces bytes buffered inside FileStream to get written into a variable?




2)
a) Does StreamReader also buffer data?

b) I assume StreamWriter also buffers data internally, waiting for a
buffer to be full before it gets written to the underlying stream?




3) But if both StreamWriter and FileReader already buffer data
internally, then why would we also need special classes for
buffering?





4) I assume that if a WriteStream ( or any other buffering stream ) object
buffers data, then data will only get written to a file when we
call Flush() or Close(), or when the internal buffer is full?





5)

a) Does the OS ( or its disk buffer ) also buffer data from our streams?

b) If so, then I assume that calling Flush() from any of the I/O
objects ( FileStream or StreamWriter etc ) forces the underlying OS
buffer to flush its contents also?

c) Is it possible that we would write some text to a disk ( via the
StreamWriter class ), but since we’d never call Flush() or Close() on
that stream, the text would get buffered by a disk buffer, which
would only write it to a file at some later time ( perhaps when our
app has already stopped )?





6)

static void Main()
{
    FileStream fs = new FileStream(@"D:\test.txt", FileMode.Create);
    StreamWriter sw = new StreamWriter(fs);
    StreamWriter sw1 = new StreamWriter(fs);

    sw.Write("sw");
    sw1.Write("sw1");

    fs.Close();
    sw.Close(); // exception
    sw1.Close(); // exception
}

a) It seems that closing ‘fs’ doesn’t automatically close its wrapping
classes ‘sw’ and ‘sw1’ ( I’m assuming this because if ‘sw’ and ‘sw1’
were also closed, then the two would flush their buffered data to the
underlying stream )? So how do we close ‘sw1’ and ‘sw’, if closing
them ( when we already closed underlying stream ) causes “cannot
access closed file” exception to be thrown? We could, instead of
closing ‘fs’ directly, close one of its wrapper classes, say ‘sw’. But
we’d still need to somehow close ‘sw1’?!





7) What does closing a stream mean? Just that the OS releases the
resources previously occupied by the stream ( lock on files, etc )?



thank you
 
beginwithl

hi


On Wed, 04 Feb 2009 15:45:23 -0800, Michael D. Ober



What's "FileReader"? Do you mean "StreamReader"?

Who says you "also need special classes for buffering"? FileStream just
happens to use buffering as part of its normal operation, for efficiency
reasons. You can actually disable the buffering if you really want to
(specify an explicit buffer size of 0), but in any case, FileStream isn't
a "special class for buffering", it's a special class for file i/o that
just happens to have buffering.

a) I didn’t mean to imply that FileStream is a special class for
buffering, but since FileStream, StreamWriter and StreamReader all
have buffering capabilities, why does .NET also have special buffering
classes ( those with the word Buffer in their name )?

b) In any case, based on what criteria does a FileStream object choose
whether it should write data directly to a file or buffer it instead
( so far it always seemed to write it directly to a file )?

Dispose() should always flush the data. I don't know what "WriteStream"
is, but assuming it's a typo for "StreamWriter", then disposing a
StreamWriter will flush the StreamWriter buffers as well as the underlying
Stream.

Ah, so StreamWriter.Flush does also flush the underlying stream
buffers.



No. Because the question is invalid, the answer must be "no". If you
never call Flush() on FileStream (for example), the data that's still
stuck in the FileStream's buffers would _not_ "get buffered by a disk
buffer". It wouldn't get that far; that's the whole point of calling
Flush(), to make sure it does get that far.

But data also gets automatically flushed when FileStream’s buffer is
full?!

Now, if you _do_ call Flush(), there is in fact a possibility that the
buffers from the OS object on (i.e. out of FileStream's control) wouldn't
be written immediately and would be written at some later time, even after
your application has stopped.

What are the chances of that happening ( small, I hope )?

static void Main()
{
    FileStream fs = new FileStream(@"D:\test.txt", FileMode.Create);
    StreamWriter sw = new StreamWriter(fs);
    StreamWriter sw1 = new StreamWriter(fs);

    fs.Close();
    sw.Close(); // exception
    sw1.Close(); // exception
}

a) It seems that closing ‘fs’ doesn’t automatically close its wrapping
classes ‘sw’ and ‘sw1’ ( I’m assuming this because if ‘sw’ and ‘sw1’
were also closed, then the two would flush their buffered data to the
underlying stream )?

No, it doesn't. A Stream inside a StreamWriter knows nothing about the
StreamWriter.

So how do we close ‘sw1’ and ‘sw’, if closing
them ( when we already closed underlying stream ) causes “cannot
access closed file” exception to be thrown?

You don't. It's a good reason not to have a given Stream attached to more
than one StreamWriter at a time.

Why would having two wrapper classes be considered a bad idea
( besides the problems with Close() )?

At best, you can call Flush() on both StreamWriters, and then close them
(catching and ignoring the exception on the second one you close).

But does calling Close() on sw1 have any effect at all, since the exception
being thrown suggests that sw1.Close() failed at whatever it was
trying to do?!

Yes, you should (though in reality, probably not much harm would come from
failing to).

* By “Yes, you should” were you referring to closing 'sw' or 'sw1' or
both?

* Thus, were you implying that there would be no harm if I failed to
close either of the two?

* Why would there be no harm? Due to the fact that by closing
underlying stream we already released the file resources?

In reality, the answer is "don't do that". It's an ugly way
to use StreamWriter. If you really must attach two StreamWriters to the
same FileStream, you should create an intermediary Stream sub-class that
itself wraps FileStream, and which is used to create each StreamWriter,
and which ignores the close/dispose from the StreamWriters (you would of
course have to close the FileStream explicitly yourself, _after_ the
StreamWriters have been closed).

* So this subclass would only close the FileStream once both
StreamWriters had called Close/Dispose ( when the first StreamWriter
calls Close(), it would ignore it, but when the second also calls it,
then it would close the underlying stream? )?

* But how would this Stream sub-class prevent StreamWriters from
calling Close() directly on underlying FileStream?




This is a bit off topic:

From MSDN:
“Be sure to call the Dispose method on all FileStream objects”

a) Why? Don’t you also clean up the resources or at least release the
file handle by simply calling FileStream.Close()?

b) So you should always call FileStream.Dispose() instead of
FileStream.Close()? My book hasn’t mentioned anything like that and in fact
always used Close() instead. Uh!

c) I haven’t bothered learning Dispose() simply due to the fact that
my book claims one only needs it when dealing with unmanaged
resources. But since you should always call Dispose() on a FileStream
object, I assume FileStream handles unmanaged resources and as such it
needs to call Dispose() to release file handles and locks as soon as
possible?

d) But then, don’t most .NET classes call Win32 API functions? Thus,
shouldn’t you then call Dispose() on most .NET objects?


e)

* StreamWriter also has a Dispose() method. But since it is just a
wrapper class, I’d assume it doesn’t deal with unmanaged resources, so
why need to call it? Perhaps it calls the Dispose() of the underlying
Stream object?

* Thus, I should also always call StreamWriter.Dispose() instead of
StreamWriter.Close()?

At least in cases where I haven’t called FileStream.Dispose()
directly?


BTW - I assume ‘StreamReader buffering data’ means that when it reads
from a file, it reads more characters than requested by Read() and
thus it doesn’t have to make so many calls to the underlying system?!


I appreciate it
 
beginwithl

hi


Without a specific example, it's impossible to say.  But, as an example:  
there's a BufferedStream class.  As I mentioned, not all Stream classes 
buffer.  But, sometimes it's useful to add buffering to a non-buffering 
Stream class.  Using BufferedStream is a way to do that.

One would not normally use BufferedStream with a FileStream, since that  
would be pointless.  But you might, for example, use it with a  
NetworkStream, or some other Stream implementation that doesn't itself  
offer buffering functionality.  Note, of course, in some sense there's  
pretty much no such thing as an i/o class that doesn't do buffering.  Even  
with NetworkStream, which doesn't internally do buffering, there are  
buffers _somewhere_.
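As a rough illustration of adding buffering to a stream that has none, here is a minimal sketch. `CountingStream` is a hypothetical stand-in for a non-buffering stream such as NetworkStream (it is not part of .NET); it only records what reaches it. The buffer size of 4096 is an arbitrary choice for the example.

```csharp
using System;
using System.IO;

// Hypothetical write-only stream with no buffering of its own. It only counts
// what reaches it, standing in for something like a NetworkStream.
public sealed class CountingStream : Stream
{
    public int WriteCalls { get; private set; }
    public long BytesReceived { get; private set; }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();

    public override void Write(byte[] buffer, int offset, int count)
    {
        WriteCalls++;
        BytesReceived += count;
    }
}

public static class BufferedStreamDemo
{
    // Returns the number of bytes the inner stream has seen before and after Flush().
    public static (long before, long after) Run()
    {
        var inner = new CountingStream();
        using (var buffered = new BufferedStream(inner, 4096))
        {
            // 100 one-byte writes, all of which fit in the 4096-byte buffer.
            for (int i = 0; i < 100; i++)
                buffered.WriteByte((byte)i);

            long before = inner.BytesReceived;
            buffered.Flush(); // pushes the buffered bytes down to the inner stream
            long after = inner.BytesReceived;

            Console.WriteLine($"inner Write calls: {inner.WriteCalls}");
            return (before, after);
        }
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"bytes at inner stream before Flush: {before}, after Flush: {after}");
    }
}
```

The inner stream sees nothing until the BufferedStream is flushed, and then receives the accumulated bytes in far fewer calls than the 100 writes made by the caller.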


It always buffers.  I was incorrect previously in stating that you could  
specify a buffer size of 0.  That's an illegal value for the buffer size  
for FileStream.

If it always buffers, then it gets flushed pretty quickly, since in my
experience it almost always writes to the file immediately. I
assume that is due to having a pretty small default buffer size?!

Yes.  I think the docs are bit weak in this area, but flushing the  
StreamWriter also flushes the underlying stream.


Of course.  Otherwise how would the buffer be made empty so that new data  
could be added?



The chances of the delay in writing to the physical media are quite good, 
actually.  But, the delay is in practice fairly short and only a complete  
system failure (e.g. lose power) before the data is finally written would 
produce any observable consequences.

So, don't pull the plug on your PC at the same time your program is  
flushing data.
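For the rare case where the OS-level delay matters, a hedged sketch: later .NET versions (from .NET Framework 4, which postdates this thread) added a `FileStream.Flush(bool flushToDisk)` overload that also asks the OS to write its cached data for the file to the device. The example keeps a reference to the FileStream only to make that call; the file name is a temporary scratch file chosen for the demo.

```csharp
using System;
using System.IO;

public static class FlushToDiskDemo
{
    public static string Run()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
            using (var sw = new StreamWriter(fs))
            {
                sw.Write("important");
                sw.Flush();     // StreamWriter buffer -> FileStream -> OS
                fs.Flush(true); // additionally ask the OS to flush its cache for this file
            }
            return File.ReadAllText(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

This costs performance, so it is usually reserved for data that must survive a crash or power loss.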


Because that's the nature of wrappers.  Once you've wrapped an instance of  
something in another class, that other class owns that instance.  If you  
then try to wrap it in yet another instance of a class, then you are  
basically saying your wrapped object has two owners.

In certain cases, of course, multiple ownership is valid.  But it has to  
be prearranged explicitly, and supported by the classes involved.  
Otherwise, single ownership is the assumed standard.

There are specific reasons this is especially important in the case of  
StreamWriter and StreamReader, but really these just follow the more  
general rule.  Those specific reasons are just examples of why the rule 
exists.

The specific reasons being ones we discussed above ( the second
StreamWriter trying to call Close() ( when the first one already
called Close ) causes an exception to be thrown ), or are there also
any other reasons ( I know it has to do with ownership )?

Very little effect.  The primary purpose of Close() is to close the  
underlying stream.



As an answer to a question, it applies to the question.


I implied no such thing.  I _stated_ that "not much harm would come".  
"Not much harm" is not the same as "no harm".


Two reasons: first, yes...if you've closed the stream, you've already  
accomplished the main thing that closing the writer(s) would accomplish.  
Second, you have very good chances (but no guarantees) that the finalizer 
will be executed and perform the work your explicit call to Close() would 
have done.

These are not reasons to ignore the advice to do things correctly.  I'm 
just saying that of the sins you might commit, this is down on the more  
benign end of the scale.



No.  It would always ignore a Close().  Your own code would have to  
explicitly dispose your custom Stream class, when you were sure you were  
done with it.


How would StreamWriters know about the underlying FileStream?  Where would  
they get that reference?
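A minimal sketch of such an intermediary class, under some assumptions: `NonClosingStream` is a hypothetical name, it simply forwards every call to the wrapped stream, and here each StreamWriter gets its own wrapper instance. The StreamWriters only ever hold a reference to the wrapper, so their Close() calls never reach the FileStream; the calling code disposes the FileStream itself at the end.

```csharp
using System;
using System.IO;

// Hypothetical pass-through wrapper: forwards I/O to an inner stream but
// deliberately does not close it when the wrapper is closed or disposed.
public sealed class NonClosingStream : Stream
{
    private readonly Stream _inner;

    public NonClosingStream(Stream inner) { _inner = inner; }

    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => _inner.CanWrite;
    public override long Length => _inner.Length;
    public override long Position
    {
        get => _inner.Position;
        set => _inner.Position = value;
    }

    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => _inner.Read(buffer, offset, count);
    public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
    public override void SetLength(long value) => _inner.SetLength(value);
    public override void Write(byte[] buffer, int offset, int count) => _inner.Write(buffer, offset, count);

    protected override void Dispose(bool disposing)
    {
        // Intentionally leave _inner open; its real owner disposes it.
        base.Dispose(disposing);
    }
}

public static class SharedStreamDemo
{
    public static string Run()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            using (var fs = new FileStream(path, FileMode.Create))
            {
                var sw = new StreamWriter(new NonClosingStream(fs));
                var sw1 = new StreamWriter(new NonClosingStream(fs));

                sw.Write("sw");
                sw1.Write("sw1");

                sw.Close();  // flushes "sw"; the FileStream stays open
                sw1.Close(); // flushes "sw1"; no exception this time
            } // the FileStream is closed here, after both writers
            return File.ReadAllText(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Note that the text reaches the file in the order the writers are flushed, not the order Write() was called, which is one more reason to avoid sharing a stream between writers.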



A quote out of context is almost never very useful for discussion.

That said, some points:

     -- If an object implements IDisposable, you should always call 
Dispose() when you're done.
     -- You don't necessarily call Dispose() explicitly.  There are ways  
that Dispose() can be called implicitly, and this meets the requirements  
of a statement like what you've quoted.

In the case of stream-related classes, we implicitly call Dispose() by
calling Close()?!

     -- For the stream-related classes, Close() and Dispose() are basically  
equivalent.  Stream even specifically documents this.  The only thing 
Close() is supposed to do is call Dispose().


See above.


You can use either or both.  I personally go back and forth; conceptually,  
I feel that one should do both.  But on the other hand, MSDN is pretty  
clear about the equivalence of the two, and Dispose() is generally a lot  
more convenient to call correctly, due to the "using" statement.
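A short sketch of the "using" pattern mentioned above, assuming a temporary scratch file. Leaving the block disposes the StreamWriter, which flushes its buffer and closes the FileStream it wraps.

```csharp
using System;
using System.IO;

public static class UsingDemo
{
    public static (string text, bool streamStillWritable) Run()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            var fs = new FileStream(path, FileMode.Create);
            using (var sw = new StreamWriter(fs))
            {
                sw.Write("hello");
            } // sw.Dispose() runs here, even if the block had thrown

            // The writer's Dispose() flushed the text and closed fs as well.
            return (File.ReadAllText(path), fs.CanWrite);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        var (text, writable) = Run();
        Console.WriteLine($"file contains: {text}, FileStream still writable: {writable}");
    }
}
```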


Sure.  It doesn't really matter why an object implements IDisposable.  If  
it does, you need to call Dispose() on it when you're done with it.


Calling a Win32 function isn't what creates the need for disposal.  It's  
the use of data structures that can't be tracked using the managed memory 
management.  I.e. "unmanaged resources".

Uhm, I know we’ve already talked about unmanaged resources ( UR ), but
something’s still bothering me. I realize they are data structures
that aren’t under the control of GC. Anyways, my little assumption on
why not cleaning up unmanaged resources would be bad:

• URs may be in possession of something, be it a file or network
connection, and as such other objects/apps can’t access that something
• or they represent an open door to something they “hold/are connected
to” for uninvited guests ( such as hackers )
• and of course they cause leaking

?

Regardless, the rule is simple: you should be calling Dispose() on any  
object that implements IDisposable.  And of course, you can't call  
Dispose() on any other object.



It calls Close() on the underlying Stream object.  Which as you know now  
is the same as calling Dispose().


Either is fine.


Once your FileStream instance has been wrapped in a StreamWriter, you  
should not do anything else with it.  StreamWriter owns it, and you need  
to operate on StreamWriter instead.

As with any rule, there are always exceptions to the rule.  But it's a  
good rule, nonetheless.


Again, quotes out of context are not very useful for discussion.

Sorry about that. What I meant to ask:

Since StreamReader also buffers data, I was wondering what exactly
does that mean in the case of StreamReader.
I assume that when it reads from a file, it reads more characters
than requested by Read() and thus it doesn’t have to make so many
calls to the underlying system and the next time Read() is called, it
could perhaps simply return the result from its internal buffer
without making a call to the underlying stream?
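The assumption above can be checked with a small sketch. It uses a MemoryStream of 100 ASCII bytes in place of a file (an assumption made to keep the example self-contained) and looks at how far the underlying stream has advanced after a single one-character Read().

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderBufferDemo
{
    // Reads one character and reports how many bytes the reader pulled
    // from the underlying stream in order to do so.
    public static (char first, long consumed) Run()
    {
        byte[] data = Encoding.ASCII.GetBytes(new string('x', 100));
        using (var ms = new MemoryStream(data))
        using (var reader = new StreamReader(ms))
        {
            char first = (char)reader.Read(); // ask for a single character
            return (first, ms.Position);      // position of the underlying stream
        }
    }

    public static void Main()
    {
        var (first, consumed) = Run();
        Console.WriteLine($"read '{first}', underlying stream advanced by {consumed} bytes");
    }
}
```

The underlying stream advances by more than one byte, so later Read() calls can be served from the reader's internal buffer without touching the stream.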


thanx Pete
 
beginwithl

hi


It's impossible to comment on the exact behavior without a precise code
example. But, if you see data getting written to the underlying file from
FileStream, there's only one reason that could happen: more data was
written than could fit in the buffer.

Sometimes this is because the buffer was actually full. Sometimes it's
because the number of bytes written exceeds the remaining available space
in the buffer. Of course, if the write is larger than the buffer itself,
there's no point in buffering at all, so obviously in that case, the
entire write gets written to the file.

Note that FileStream isn't like some of the other buffers in the OS.
Flushing the contents of the buffer happens only when the FileStream
instance is actually explicitly used; if you don't call one of the
specific methods on the instance that can flush data, the buffer won't
ever be flushed.
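A minimal sketch of this behavior, assuming a temporary scratch file and an explicit 4096-byte buffer. The file's length is read through a second, independent handle, because the writing FileStream's own Length also counts bytes that are still sitting in its buffer.

```csharp
using System;
using System.IO;

public static class FileStreamBufferDemo
{
    // Length of the file as seen through a second, independent handle.
    static long LengthOnDisk(string path)
    {
        using (var probe = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            return probe.Length;
    }

    public static (long beforeFlush, long afterFlush) Run()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.Read, 4096))
            {
                // 3 bytes fit easily in the 4096-byte buffer, so nothing is written yet.
                fs.Write(new byte[] { 1, 2, 3 }, 0, 3);
                long before = LengthOnDisk(path);

                fs.Flush(); // hand the buffered bytes to the OS
                long after = LengthOnDisk(path);

                return (before, after);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"file length before Flush: {before}, after Flush: {after}");
    }
}
```

Writing more than the buffer can hold would make data appear in the file without any explicit Flush(), which matches the "buffer was full" case described above.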

I'm a bit confused about the following:

"if you don't call one of the specific methods on the instance that
can flush data"

By “instance” you are also referring to its wrapper class ( such as
StreamWriter )?

"if you don't call one of the specific methods on the instance that
can flush data, the buffer won't ever be flushed."

What specific methods did you have in mind? If you call Flush() on its
wrapper class, then FileStream buffer will also get flushed. If you
call StreamWriter.Write( X ) and if size of X exceeds size of
StreamWriter buffer or if X causes buffer to be full, then
StreamWriter buffer will be flushed and consequently so will be
FileStream buffer.

Thus, I’m not sure what you meant by “if one doesn’t call specific
methods, then buffer won’t ever be flushed” ?


cheers
 
