StreamReader / StreamWriter vs System.IO.File.*?


Zytan

I have installed the Visual C# 2005 Code Snippets, and the snippets
for file handling use StreamReader and StreamWriter, instead of
System.IO.File.*. VB 2005 code snippets don't use these. Why are
they suggesting the use of these classes instead of the fundamental
File methods? Are they better? I thought code snippets were supposed
to show things in the easy / simple way.

What do you guys think? What should a beginner learn to do when
learning to read / write files?

Zytan
 
the snippets
for file handling use StreamReader and StreamWriter

And they even use the using-statement! Which is hard to grasp quickly
(especially when the docs show it using a resource created before the
using statement block, which is strange since it is supposed to
destruct that object at the end of the block, but how can it do this
when the object is *still in scope* after the using statement
block!?)
I thought code snippets were supposed
to show things in the easy / simple way.

So, these snippets are certainly not the easiest way.

Zytan
 
Zytan said:
I have installed the Visual C# 2005 Code Snippets, and the snippets
for file handling use StreamReader and StreamWriter, instead of
System.IO.File.*. VB 2005 code snippets don't use these. Why are
they suggesting the use of these classes instead of the fundamental
File methods? Are they better? I thought code snippets were supposed
to show things in the easy / simple way.

What do you guys think? What should a beginner learn to do when
learning to read / write files?

I do not have code snippets, but I think you can use
the following rule of thumb:

*Stream for binary files
*Writer/*Reader for text files (ignoring BinaryReader/BinaryWriter)

*Writer/*Reader are additional functionality on top of *Stream.

Arne
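A minimal sketch of that rule of thumb, assuming a scratch file from Path.GetTempFileName(). The class and method names (StreamVersusReaderDemo, RoundTripBytes, RoundTripText) are made up for illustration and are not from the snippets being discussed:

```csharp
using System;
using System.IO;
using System.Text;

static class StreamVersusReaderDemo
{
    // Binary data: write raw bytes through a FileStream and read them back.
    public static byte[] RoundTripBytes(string path, byte[] data)
    {
        using (FileStream output = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            output.Write(data, 0, data.Length);
        }
        using (FileStream input = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            byte[] buffer = new byte[input.Length];
            int total = 0;
            // Read may return fewer bytes than requested, so loop until full.
            while (total < buffer.Length)
            {
                int read = input.Read(buffer, total, buffer.Length - total);
                if (read == 0) break; // unexpected end of file
                total += read;
            }
            return buffer;
        }
    }

    // Text data: the writer/reader layer handles the character encoding.
    public static string RoundTripText(string path, string line)
    {
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.WriteLine(line);
        }
        using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
        {
            return reader.ReadLine();
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            byte[] bytes = RoundTripBytes(path, new byte[] { 0x00, 0xFF, 0x10 });
            Console.WriteLine(BitConverter.ToString(bytes));
            Console.WriteLine(RoundTripText(path, "hello, text file"));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The byte version never interprets the data, while the text version goes through an encoding, which is the extra functionality the reader/writer classes add on top of a stream.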
 
Zytan said:
And they even use the using-statement! Which is hard to grasp quickly
(especially when the docs show it using a resource created before the
using statement block, which is strange since it is supposed to
destruct that object at the end of the block, but how can it do this
when the object is *still in scope* after the using statement
block!?)

At the end of the using block the Dispose method is called
which will release all unmanaged resources. It is not
"destructed".

Arne
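To make that concrete, here is a small sketch showing that a using block behaves roughly like a try/finally that calls Dispose, and that the object is still reachable afterwards. Tracked is a hypothetical class invented for this example; it only records whether Dispose was called:

```csharp
using System;

// Hypothetical disposable type that just remembers it was disposed.
sealed class Tracked : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

static class UsingExpansionDemo
{
    // Dispose is called automatically when the using block ends.
    public static Tracked WithUsing()
    {
        Tracked t = new Tracked();
        using (t)
        {
            // use t
        }
        return t; // same object, still referenced, now disposed
    }

    // Roughly what the compiler generates for the method above.
    public static Tracked WithTryFinally()
    {
        Tracked t = new Tracked();
        try
        {
            // use t
        }
        finally
        {
            if (t != null) t.Dispose();
        }
        return t;
    }

    static void Main()
    {
        Console.WriteLine(WithUsing().IsDisposed);      // True
        Console.WriteLine(WithTryFinally().IsDisposed); // True
    }
}
```

Nothing is destroyed here: Dispose is an ordinary method call, and the garbage collector reclaims the object later, whenever it is no longer referenced.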
 
Most of the methods of the File and FileStream classes deal with arrays of
bytes, so one can assume that File is good for binaries. StreamReader/Writer
read and write characters and strings, so they are useful when working with
text files, especially large ones.

File's methods can read and write text from and to a file, but they do it
with a whole string or a whole array of strings, and they process the whole
file at once. To read a huge text file with File's methods, you would have
to hold a gigabyte-size string in memory. Would you prefer that to multiple
calls to StreamReader.ReadLine()? With streams you can process large files
without requiring lots of memory.
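A short sketch of the two approaches side by side, counting the lines of a file. The class and method names are made up for illustration; the File and StreamReader calls are the standard ones:

```csharp
using System;
using System.IO;

static class WholeFileVersusStreamingDemo
{
    // Loads every line into memory at once; fine for small files.
    public static int CountLinesAllAtOnce(string path)
    {
        string[] lines = File.ReadAllLines(path);
        return lines.Length;
    }

    // Holds only one line in memory at a time; suitable for huge files.
    public static int CountLinesStreaming(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            File.WriteAllLines(path, new string[] { "alpha", "beta", "gamma" });
            Console.WriteLine(CountLinesAllAtOnce(path)); // 3
            Console.WriteLine(CountLinesStreaming(path)); // 3
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both give the same answer; the difference is only in how much of the file is in memory at any moment.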
 
Thus wrote Zytan,
I have installed the Visual C# 2005 Code Snippets, and the snippets
for file handling use StreamReader and StreamWriter, instead of
System.IO.File.*. VB 2005 code snippets don't use these. Why are
they suggesting the use of these classes instead of the fundamental
File methods? Are they better? I thought code snippets were supposed
to show things in the easy / simple way.

At the end of the day, you're always dealing with either a TextReader/TextWriter
or a Stream. Many of File's methods are convenience APIs that wrap or produce
streams and writers. It's mostly a matter of taste.


Cheers,
 
At the end of the using block the Dispose method is called
which will release all unmanaged resources. It is not
"destructed".

Ok, I am thinking that destructed = dispose. What is the difference?
When I write a destructor, and it is run, this is the object being
destructed, right?

http://msdn2.microsoft.com/en-us/library/yh598w02.aspx
shows this example:

Font font2 = new Font("Arial", 10.0f);
using (font2)
{
// use font2
}

It seems to me that font2 can still be used after the using statement
block. So what happens? Is the object destructed / disposed (sorry,
I don't know the difference), and font2 = null?

Zytan
 
File's methods can read and write text from and to a file, but they do it
with a whole string or a whole array of strings, and they process the whole
file at once. To read a huge text file with File's methods, you would have
to hold a gigabyte-size string in memory. Would you prefer that to multiple
calls to StreamReader.ReadLine()? With streams you can process large files
without requiring lots of memory.

Ah, so this is the difference. For me, I think whatever files I'll be
reading, I need them in memory all at once. So, it shouldn't matter.
For example, I could use:

string[] s = System.IO.File.ReadAllLines(path);

And, yes, it means s now stores the entire file, all at once, into
memory. But, I think this is ok for small files.

Thanks for the explanation, this is exactly what I wanted to know.

Zytan
 
At the end of the day, you're always dealing with either a TextReader/TextWriter
or a Stream. Many of File's methods are convenience APIs that wrap or produce
streams and writers. It's mostly a matter of taste.

Ok, thanks. So, I'll do it my way, and if I ever need to handle a
file as a stream of bytes (I don't think I do), I'll look into the
StreamReader/Writer.

Zytan
 
Zytan said:
Ok, I am thinking that destructed = dispose. What is the difference?
When I write a destructor, and it is run, this is the object being
destructed, right?

Well, it's being finalized. It *could* still survive, if something
during finalization promotes it - but generally, the object is about to
be garbage collected.

Being *disposed*, however, is very different - that's just calling the
Dispose method. After Dispose has been called, an object *may* still be
perfectly usable, and disposing of it doesn't affect garbage
collection, although calling Dispose will often suppress the finalizer
if there is one (because finalizers generally do the same thing).
http://msdn2.microsoft.com/en-us/library/yh598w02.aspx
shows this example:

Font font2 = new Font("Arial", 10.0f);
using (font2)
{
// use font2
}

It seems to me that font2 can still be used after the using statement
block. So what happens? Is the object destructed / disposed (sorry,
I don't know the difference), and font2 = null?

font2 still refers to the same object, it's just that Dispose() has
been called on it.
 
Ok, I am thinking that destructed = dispose. What is the difference?
Well, it's being finalized. It *could* still survive, if something
during finalization promotes it - but generally, the object is about to
be garbage collected.

Ok, so in C#, there are no destructors. They are called
'finalizers'? I can see the method name is Finalize.
Being *disposed*, however, is very different - that's just calling the
Dispose method. After Dispose has been called, an object *may* still be
perfectly usable, and disposing of it doesn't affect garbage
collection, although calling Dispose will often suppress the finalizer
if there is one (because finalizers generally do the same thing).

So, Dispose will not call Finalize, since they usually do the same
thing. So what happens when the object goes out of scope? Doesn't
Finalize get called?
font2 still refers to the same object, it's just that Dispose() has
been called on it.

Ah, so font2 isn't null. It refers to the object. But, it's an
object that cannot be used since, it's been all but finalized?
Something seems very wrong with that.

The using statement forces Dispose to be called. But going out of
scope doesn't?
Why not just use blank braces { Object x; } so when the last } is hit,
doesn't x.Dispose get called, the same as if the braces were part of
the using statement?

Zytan
 
Zytan said:
Ok, so in C#, there are no destructors. They are called
'finalizers'? I can see the method name is Finalize.

Yup, although they're generally written with the destructor-like
syntax. You should very, very rarely need one though - only when you
*directly* hold unmanaged resources. If you only have a reference to
something else which holds unmanaged resources (e.g. a Stream) then
just implementing IDisposable is enough. Allow the Stream itself to
clean up on finalization if nothing else has cleaned it up by then.
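As a sketch of that advice, here is a class that only holds a reference to a StreamWriter, so it implements IDisposable and has no finalizer. SimpleLog and its members are hypothetical names invented for this example:

```csharp
using System;
using System.IO;

// Hypothetical log class: it owns a StreamWriter, which in turn owns the
// file handle, so forwarding Dispose is enough and no finalizer is needed.
sealed class SimpleLog : IDisposable
{
    private readonly StreamWriter writer;

    public SimpleLog(string path)
    {
        writer = new StreamWriter(path, true); // true = append to the file
    }

    public void Write(string message)
    {
        writer.WriteLine(message);
    }

    public void Dispose()
    {
        writer.Dispose(); // flushes buffered text and closes the file
    }
}

static class SimpleLogDemo
{
    static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            using (SimpleLog log = new SimpleLog(path))
            {
                log.Write("started");
                log.Write("finished");
            }
            // The file is closed here, so it can be reopened and read.
            Console.WriteLine(File.ReadAllLines(path).Length); // 2
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the caller forgets to dispose the log, the writer's buffered text may never reach the file, which is one more reason to close it explicitly.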
So, Dispose will not call Finalize, since they usually do the same
thing.

In fact, the finalizer usually calls Dispose.
So what happens when the object goes out of scope? Doesn't
Finalize get called?

No. The finalizer will be called *at some point* after the object is
eligible for finalization, i.e. when there are no more references to
it. There's no guarantee when that will be, or even necessarily whether
it will happen (e.g. during application termination).
Ah, so font2 isn't null. It refers to the object. But, it's an
object that cannot be used since, it's been all but finalized?
Something seems very wrong with that.

It may or may not be usable - it depends on the implementation of
Dispose. Most of the time, it's not usable, and you would rarely use
the code pattern you showed - normally you declare the variable in the
using statement itself.
The using statement forces Dispose to be called. But going out of
scope doesn't?
Correct.

Why not just use blank braces { Object x; } so when the last } is hit,
doesn't x.Dispose get called, the same as if the braces were part of
the using statement?

No, x.Dispose doesn't get called, and neither does the finalizer
automatically - not at the point at which the variable falls out of
scope. Bear in mind that by then, there may be other references to the
same object (because x could have been passed to methods, etc). .NET
doesn't do reference counting, due to both the performance penalty and
the problem of cyclic references.

.NET has no deterministic resource clean-up - you need to do it
yourself, but the using statement makes life a lot simpler than it
would otherwise be.
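A small sketch of the difference between plain braces and a using block. Resource is a hypothetical class invented for this example that only records whether Dispose was called:

```csharp
using System;

// Hypothetical disposable type that just remembers it was disposed.
sealed class Resource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

static class ScopeVersusUsingDemo
{
    // Plain braces: x goes out of scope, but nothing calls Dispose.
    public static bool DisposedAfterPlainBlock()
    {
        Resource outer;
        {
            Resource x = new Resource();
            outer = x; // second reference, so the object can be inspected later
        }
        return outer.IsDisposed;
    }

    // using block: Dispose is called when the block ends.
    public static bool DisposedAfterUsingBlock()
    {
        Resource outer;
        using (Resource x = new Resource())
        {
            outer = x;
        }
        return outer.IsDisposed;
    }

    static void Main()
    {
        Console.WriteLine(DisposedAfterPlainBlock()); // False
        Console.WriteLine(DisposedAfterUsingBlock()); // True
    }
}
```

The second reference in each method also illustrates why leaving scope cannot trigger cleanup on its own: the runtime would have to know that no other reference exists.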
 
http://msdn2.microsoft.com/en-us/library/yh598w02.aspx
IMO, even though the above may be a valid construction, it is not the best
example. It's a shame that it's in MSDN; it should have been written like:

using ( Font font2 = new Font("Arial", 10.0f) )
{
// use font2
}

It is the second example MSDN shows. The first one is like yours.
And yes, this 2nd one is why it is so confusing to me. Maybe it would
have been best if C# didn't support it.

Zytan
 
Yup, although they're generally written with the destructor-like
syntax. You should very, very rarely need one though - only when you
*directly* hold unmanaged resources. If you only have a reference to
something else which holds unmanaged resources (e.g. a Stream) then
just implementing IDisposable is enough. Allow the Stream itself to
clean up on finalization if nothing else has cleaned it up by then.

Well, I was going to open a file for logging, and close the file at
the program's end. So, I thought I needed a finalizer. I mean, the
file doesn't close itself, does it? Or is that what Dispose is
precisely for, and perhaps it implements this. It's hard to say from
a newcomer's perspective if I have direct management of the resource
or not, such as opening / closing a file.
No. The finalizer will be called *at some point* after the object is
eligible for finalization, i.e. when there are no more references to
it. There's no guarantee when that will be, or even necessarily whether
it will happen (e.g. during application termination).

Yes, I knew about multiple references, I guess I was asking this in
terms of that being the only reference. I know C# removes the need
for me to worry about cleaning up. But, what's confusing is for
things NOT like classes and arrays, but files that I think I am
supposed to be the one manually telling it to close.
It may or may not be usable - it depends on the implementation of
Dispose. Most of the time, it's not usable, and you would rarely use
the code pattern you showed - normally you declare the variable in the
using statement itself.

Yes, I follow that a variable declared inside the using statement
itself cannot be used afterwards, since it goes out of scope. But dealing
with a variable whose Dispose method has been automatically called, but
which is still in scope, is confusing.

Right, I should know that C# doesn't guarantee when any object is
cleaned up, since the GC does it on its own clock. Which is the
reason the using statement exists, for things that must be cleaned up
right then and there.
No, x.Dispose doesn't get called, and neither does the finalizer
automatically - not at the point at which the variable falls out of
scope. Bear in mind that by then, there may be other references to the
same object (because x could have been passed to methods, etc). .NET
doesn't do reference counting, due to both the performance penalty and
the problem of cyclic references.

Yes, again, I knew about the multiple reference possibility. So,
using-statement just forces Dispose to be called at the end } where a
normal end } just says to the GC "you can clean this up when you
like" (provided there are no other references).

Zytan
 
Zytan said:
Well, I was going to open a file for logging, and close the file at
the program's end. So, I thought I needed a finalizer. I mean, the
file doesn't close itself, does it? Or is that what Dispose is
precisely for, and perhaps it implements this. It's hard to say from
a newcomer's perspective if I have direct management of the resource
or not, such as opening / closing a file.

You should close the file directly yourself, when the application is
terminating.
Yes, I knew about multiple references, I guess I was asking this in
terms of that being the only reference. I know C# removes the need
for me to worry about cleaning up. But, what's confusing is for
things NOT like classes and arrays, but files that I think I am
supposed to be the one manually telling it to close.

Exactly. .NET removes the need for you to worry about cleaning up
*memory*, but not other resources.
Yes, I follow that a variable declared inside the using statement
itself cannot be used afterwards, since it goes out of scope. But dealing
with a variable whose Dispose method has been automatically called, but
which is still in scope, is confusing.

There are some cases where it's useful. For instance, even after you've
called Dispose on a MemoryStream, you can still call ToArray.
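A minimal sketch of the MemoryStream case, using the declare-before-using pattern from the MSDN example. The method that returns the buffer contents is MemoryStream.ToArray; the class and method names around it are made up for illustration:

```csharp
using System;
using System.IO;

static class DisposedMemoryStreamDemo
{
    // Writes three bytes, lets the using block dispose the stream,
    // then reads the contents back from the disposed stream.
    public static byte[] WriteThenDispose(out bool canWriteAfterDispose)
    {
        MemoryStream stream = new MemoryStream();
        using (stream)
        {
            stream.WriteByte(1);
            stream.WriteByte(2);
            stream.WriteByte(3);
        }

        // The variable still refers to the same, now disposed, object.
        canWriteAfterDispose = stream.CanWrite;

        // ToArray is documented to work even when the stream is closed.
        return stream.ToArray();
    }

    static void Main()
    {
        bool canWrite;
        byte[] data = WriteThenDispose(out canWrite);
        Console.WriteLine(canWrite);    // False
        Console.WriteLine(data.Length); // 3
    }
}
```

Writing to the stream at that point would throw ObjectDisposedException, so the object is only partly usable, which matches the "may or may not be usable" description.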
Right, I should know that C# doesn't guarantee when any object is
cleaned up, since the GC does it on its own clock. Which is the
reason the using statement exists, for things that must be cleaned up
right then and there.
Exactly.


Yes, again, I knew about the multiple reference possibility. So,
using-statement just forces Dispose to be called at the end } where a
normal end } just says to the GC "you can clean this up when you
like" (provided there are no other references).

Well, the brace doesn't even do that really - the garbage collector
knows when a variable can last be used, and can collect an object even
when it's still in scope (in release mode). For instance:

object x = new object();
// Some code which uses x
// Point A
// Code which doesn't use x

At point A, if the garbage collector kicks in, it can garbage collect
the object referred to by x (assuming there are no other references)
because it knows the object can't be used by anything else.
 