FileStream.Close() & GarbageCollection - Memory Leak Question

Tom

When I use a FileStream variable's null status to determine logic flow
I can get an exception thrown because the variable is not set to null
within the FileStream.Close() method. I could use try{}, catch{}
blocks to trap out the error ... but I need to understand if
programmatically setting the variable to null creates problems in the
GarbageCollection.

i.e. >>

FileStream fs = new FileStream("C:/testfile.txt", FileMode.Open,
FileAccess.Read);

if (fs == null) Console.WriteLine("fs == null");
else Console.WriteLine("fs.Length = {0}", fs.Length.ToString());

fs.Close();

fs = null; // << Does this produce a memory leak?

// Without the above statement an exception is thrown below.

// Replacing above statement with >> GC.Collect();
// Produces the same error.

if (fs == null) Console.WriteLine("fs == null");
else Console.WriteLine("fs.Length = {0}", fs.Length.ToString());

____________________________

Without the null assignment statement above, an exception is thrown
when trying to access the Length of a closed file in the second
if (fs == null) block.

The GC forcing produces the same error condition. The fs variable is
retaining a no longer valid reference? Seems the Close() method
"should" null the variable?

Does programmatically setting fs = null create a memory leak? I am
concerned the GarbageCollector uses the reference and by nulling it I
am creating a problem?

Thanks in advance for any clarifications.

-- Tom
 
Lasse Vågsæther Karlsen

Tom said:
When I use a FileStream variable's null status to determine logic flow
I can get an exception thrown because the variable is not set to null
within the FileStream.Close() method. I could use try{}, catch{}
blocks to trap out the error ... but I need to understand if
programmatically setting the variable to null creates problems in the
GarbageCollection.

i.e. >>

FileStream fs = new FileStream("C:/testfile.txt", FileMode.Open,
FileAccess.Read);

if (fs == null) Console.WriteLine("fs == null");
else Console.WriteLine("fs.Length = {0}", fs.Length.ToString());

fs.Close();

fs = null; // << Does this produce a memory leak?

Not at all. It just clears the contents of the variable. The memory
allocated to the object still exists somewhere, pending a future
garbage collection.
// Without the above statement an exception is thrown below.

// Replacing above statement with >> GC.Collect();
// Produces the same error.

if (fs == null) Console.WriteLine("fs == null");
else Console.WriteLine("fs.Length = {0}", fs.Length.ToString());

If you don't clear the variable, then yes, this will throw an exception.
Setting it to null is the correct way.
The GC forcing produces the same error condition. The fs variable is
retaining a no longer valid reference? Seems the Close() method
"should" null the variable?

No, it should not, and cannot.

Close() does not free the memory allocated to the object. What Close()
and Dispose() (of IDisposable, you should look it up) do is release
unmanaged resources, like file handles, TCP/IP connection handles, etc.

The memory allocated to the object is still allocated, still referenced
by your variable.

However, the object has specific safeguards in place that will throw an
exception if you try to use properties or methods that need the file to
still be open.

If you need to use the variable itself to tell the rest of the program
whether you have the file or not, clear it, like you've shown.

Close or Dispose does not null the variable because they simply don't
have access to the variable.
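
Here is a minimal sketch of that behavior. The class name CloseDemo and the scratch file from Path.GetTempFileName() are my own illustrative choices, not part of the original code, and I am assuming the exception you saw is ObjectDisposedException, which is what FileStream.Length throws on a closed stream in the runtimes I know of.

```csharp
using System;
using System.IO;

public static class CloseDemo
{
    // Returns true if the variable is still non-null after Close() and
    // reading Length on the closed stream throws ObjectDisposedException.
    public static bool LengthThrowsAfterClose()
    {
        string path = Path.GetTempFileName();   // hypothetical scratch file
        FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read);
        try
        {
            fs.Close();

            // Close() released the file handle, but it has no way to reach
            // our local variable, so 'fs' still refers to the closed object.
            bool stillReferenced = fs != null;

            try
            {
                long length = fs.Length;        // needs an open file
                return false;                   // not expected to get here
            }
            catch (ObjectDisposedException)
            {
                return stillReferenced;
            }
        }
        finally
        {
            fs = null;                          // only clears the variable
            File.Delete(path);
        }
    }
}
```

Setting fs = null after Close() just gives the rest of your code a reliable "no file" marker to test against.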
Does programmatically setting fs = null create a memory leak? I am
concerned the GarbageCollector uses the reference and by nulling it I
am creating a problem?

No, it does not.

The garbage collector will always be able to find all the objects. What
it uses to decide what to collect is the absence of live references to
an object.

A live reference is not just any variable that holds a reference to the
memory. The variable itself must still be live (that is, still going to
be used); otherwise the garbage collector may collect the memory anyway.
 
Tom

Thank you Lasse Karlsen!!

I am less concerned about an error being in my code ... but still a
bit confused about understanding the inner workings of
GarbageCollection. I'm looking for a good list of the top 20 or so
most common causes of memory leaks in the C# environment.

The "Stream.Close Method" documentation states: "Closes the current
stream and releases any resources (such as sockets and file handles)
associated with the current stream."

I considered the memory a "resource" that would be freed but I have no
supporting evidence for this thought. :( The bridging between managed
and unmanaged components is certainly confusing. In my mind I thought
that if part of the component was unmanaged ... then all of it was?
Now, with your explanation, I am trying to accept that system level
buffers are also "managed" when they are created within the .Net
environment?

I am used to unmanaged malloc(), free(), fclose() and _fcloseall(). I
guess I should think of the .Net File.OpenRead() as a method that
creates a "partially" managed object?

I wish I could find a comprehensive list of unmanaged resources used
within the .Net environment. My incomplete list only has the two items
mentioned above: 1) Sockets, 2) File handles.

I've read: (i) In a mixed managed/unmanaged solution you must manually
free unmanaged memory allocations. (ii) Usage of Stream.Close() is
required to "avoid memory leaks". -- But these two statements don't
necessarily suggest the buffer portion in a FileStream object is
either managed or unmanaged.

I obviously need to read and study a lot more about .Net Memory
Management. Deeper into Yahoo I now go. For now the following *Delphi*
article has my head spinning:

http://dn.codegear.com/article/28344

I know zilch about Delphi and am hoping that a similar C# approach is
out there somewhere. The memory dump and tagging in the above article
is very interesting. Excellent tool and article ... just don't force
me into Delphi, please!

-- Tom
 
Jesse McGrew

Thank you Lasse Karlsen!!

I am less concerned about an error being in my code ... but still a
bit confused about understanding the inner workings of
GarbageCollection. I'm looking for a good list of the top 20 or so
most common causes of memory leaks in the C# environment.

The "Stream.Close Method" documentation states: "Closes the current
stream and releases any resources (such as sockets and file handles)
associated with the current stream."

I considered the memory a "resource" that would be freed but I have no
supporting evidence for this thought. :( The bridging between managed
and unmanaged components is certainly confusing. In my mind I thought
that if part of the component was unmanaged ... then all of it was?
Now, with your explanation, I am trying to accept that system level
buffers are also "managed" when they are created within the .Net
environment?

The only resource that's truly "managed" is memory in the
garbage-collected .NET heap.

However, managed objects can still "own" unmanaged resources (pointers
to memory on the global heap, file and socket handles, etc.), which
should be released by calling a method like Close or Dispose.

If the object is designed correctly, then those unmanaged resources
will be released eventually even if you forget to call the method,
because the object will have a finalizer that runs when the GC
collects it. But you should avoid relying on that, because you never
know when the finalizer is going to run - the garbage collector works
at its own pace. That's a problem for files: users will be frustrated
when they can't move or delete a file that *should* be closed, but is
actually still open because the garbage collector hasn't gotten to it
yet.
I am used to unmanaged malloc(), free(), fclose() and _fcloseall(). I
guess I should think of the .Net File.OpenRead() as a method that
creates a "partially" managed object?

If you like. You don't need to remember which methods do that - just
look at the return type. Whenever an object implements the IDisposable
interface, that means you should call Dispose (or a similar method
like Close) when you're done using it. The "using" block in C# makes
that convenient in many cases.

What actually happens is the FileStream calls a Windows API function
to open the file, which returns a file handle, and that handle is
stored in a field of the object. But the handle is just a number, as
far as .NET is concerned, so the garbage collector doesn't
automatically know that it needs to be cleaned up.

You could think of the handle like a dry cleaning ticket. Your maid
(the garbage collector) looks at the pile of papers on your desk
(memory blocks on the managed heap) and throws them all away, not
realizing that one of them is the key to getting your suit back
(identifying an unmanaged resource). To her, it's just another paper.
So if you want to get your suit back, you have to bring the ticket
back to the cleaners (call Close or Dispose, which passes the handle
to a Windows API function) before the maid finds it.

In a technical sense, the dry cleaning ticket is indeed a piece of
paper like any other (FileStream is a managed object). The link
between that ticket and your suit is a semantic link in your mind, not
a physical aspect of the ticket itself.
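
To make the "ticket" idea concrete, here is a rough sketch of a managed object that owns an unmanaged resource. NativeBuffer is a hypothetical class I made up for illustration; it is not how FileStream is actually implemented. It uses Marshal.AllocHGlobal to get a block of unmanaged memory, and the IntPtr field plays the role of the dry cleaning ticket.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: a managed object holding a pointer to unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    // To the garbage collector this is just a number; it does not know
    // that an unmanaged allocation hangs off it.
    private IntPtr _ptr;

    public NativeBuffer(int size)
    {
        _ptr = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _ptr == IntPtr.Zero; }
    }

    // Deterministic cleanup: the caller hands the ticket back.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    // Safety net: runs at some unspecified time if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_ptr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_ptr);
            _ptr = IntPtr.Zero;      // makes a second call harmless
        }
    }
}
```

The object itself is ordinary managed memory. Only the block behind _ptr needs the explicit Dispose call, or the finalizer as a fallback.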
I wish I could find a comprehensive list of unmanaged resources used
within the .Net environment. My incomplete list only has the two items
mentioned above: 1) Sockets, 2) File handles.

There are also handles for Windows objects like bitmaps, fonts, and
icons, and pointers to memory blocks allocated from the global and COM
heaps.
I've read: (i) In a mixed managed/unmanaged solution you must manually
free unmanaged memory allocations. (ii) Usage of Stream.Close() is
required to "avoid memory leaks". -- But these two statements don't
necessarily suggest the buffer portion in a FileStream object is
either managed or unmanaged.

You don't really need to worry about the internals of the
FileStream... just notice that it implements IDisposable.

Jesse
 
Lasse Vågsæther Karlsen

Tom said:
Thank you Lasse Karlsen!!

I am less concerned about an error being in my code ... but still a
bit confused about understanding the inner workings of
GarbageCollection. I'm looking for a good list of the top 20 or so
most common causes of memory leaks in the C# environment.

There are two major sources of memory leaks in an application (C# or
otherwise) that I can think of.

There's the unintentional, bug-type memory leak where you simply forget
to deallocate memory after you're done with it. This typically happens
in an unmanaged programming language when you forget to write the
appropriate cleanup code, or forget to call it. A typical problem is
that you return allocated memory out of a function, which the calling
code is now responsible for deallocating, but this responsibility is
both hard to ensure and hard to communicate.

The other type of memory leak, which is somewhat related, is that you
don't explicitly remove references to large data structures when you're
done with them. While you don't have to explicitly call code to free
memory in a managed world, you still have to make everything you know
about the lifetime of your data available to the garbage collector.

A typical example of this is a static variable that contains a list of
some sort: you add large objects to that list and then never remove
them when you're done with them, which keeps the objects around until
the program closes.
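
A small sketch of that kind of leak, and of the fix. ReportCache, Rent and Return are names I invented for the example; the point is only that a static list is reachable for the life of the program, so everything in it stays alive until you remove it.

```csharp
using System;
using System.Collections.Generic;

public static class ReportCache
{
    // A static field stays reachable for as long as the application runs,
    // so every buffer stored here is kept alive by the list.
    private static readonly List<byte[]> Buffers = new List<byte[]>();

    public static int Count
    {
        get { return Buffers.Count; }
    }

    public static byte[] Rent(int size)
    {
        byte[] buffer = new byte[size];
        Buffers.Add(buffer);        // the list now holds a live reference
        return buffer;
    }

    // Without this call the buffer can never be collected, even after the
    // caller has set its own variable to null.
    public static void Return(byte[] buffer)
    {
        Buffers.Remove(buffer);     // arrays compare by reference here
    }
}
```

Clearing your own local variable is not enough here; the reference inside the static list is the one that matters.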
The "Stream.Close Method" documentation states: "Closes the current
stream and releases any resources (such as sockets and file handles)
associated with the current stream."

I considered the memory a "resource" that would be freed but I have no
supporting evidence for this thought. :( The bridging between managed
and unmanaged components is certainly confusing. In my mind I thought
that if part of the component was unmanaged ... then all of it was?
Now, with your explanation, I am trying to accept that system level
buffers are also "managed" when they are created within the .Net
environment?

Memory is a resource, just not in the sense meant in this context.

Dispose/Close typically disposes of "scarce" resources, like handles,
files, sockets. Memory is not considered a scarce resource in the same
manner.

It is your responsibility to explicitly call dispose/close when you're
done with the purpose of the object, file contents, socket
communication, etc.

It is the garbage collector's responsibility to reclaim the memory used
by the object when you no longer have any live references to it.
I am used to unmanaged malloc(), free(), fclose() and _fcloseall(). I
guess I should think of the .Net File.OpenRead() as a method that
creates a "partially" managed object?

You should look for the IDisposable interface. If an object implements
this interface, it typically uses a scarce resource and you should
always dispose of it when you're done with it. In the case of streams,
Close does the same as Dispose. That is typically true of other classes
too, but check the documentation just to make sure.

The IDisposable pattern also lends itself to a very easy to write
special syntax in C#:

using (FileStream stream = new FileStream(...))
{
    // ... use the stream here
}

This translates almost directly to:

FileStream stream = new FileStream(...);
try
{
    // ... use the stream here
}
finally
{
    stream.Dispose();
}
I wish I could find a comprehensive list of unmanaged resources used
within the .Net environment. My incomplete list only has the two items
mentioned above: 1) Sockets, 2) File handles.

As I said above, you don't need it. Instead, look for IDisposable
support, that should be all you need to know.

Off the top of my head I can add a few items though:

- anything related to GUI objects (buttons, windows, static labels,
listview, comboboxes, etc.)
- some types of image resources (cursors, icons)
- registry objects (though I'm not 100% positive here)
I've read: (i) In a mixed managed/unmanaged solution you must manually
free unmanaged memory allocations. (ii) Usage of Stream.Close() is
required to "avoid memory leaks". -- But these two statements don't
necessarily suggest the buffer portion in a FileStream object is
either managed or unmanaged.

Stream.Close is not required to avoid memory leaks. It is needed to
close the file when you're done with it.

If you don't call Stream.Close, or Dispose, what happens is that once
you no longer have any live references to the stream, the object is
eligible for garbage collection. You're not, however, guaranteed that
this will happen any time soon. In the meantime, the file the stream
holds a reference to is open, and thus possibly locked and unavailable
to other applications, even your own should you wish to open another
stream for it.

However, once the garbage collector collects the object, the stream will
be closed correctly.

The only thing IDisposable.Dispose / Stream.Close allows you to do is
choose the time you wish to dispose of the unmanaged resources yourself.
A properly written class will do it anyway when the garbage collector
runs, but the consequence is that unmanaged resources are held on to,
and locked, for longer than necessary.

Additionally there are some slight performance considerations when it
comes to just letting IDisposable objects lie around until GC picks them
up, so the rule is: IDisposable must be disposed of.
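
To illustrate the locking problem described above, here is a hedged sketch. LockDemo and both method names are made up for the example, and it assumes a scratch file from Path.GetTempFileName(). It opens the file with FileShare.None, lets the using block dispose of the stream, and then checks that the file can be opened again right away.

```csharp
using System;
using System.IO;

public static class LockDemo
{
    // Tries to open 'path' with no sharing; returns false if that fails
    // with an IOException (for example because the file is still in use).
    public static bool CanOpenExclusively(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            return false;
        }
    }

    public static bool ReopenSucceedsAfterUsing()
    {
        string path = Path.GetTempFileName();   // hypothetical scratch file
        try
        {
            using (FileStream stream = new FileStream(
                path, FileMode.Open, FileAccess.Read, FileShare.None))
            {
                // While this stream is open, the file is held exclusively.
            }

            // Dispose() ran at the closing brace, so the handle is already
            // released; no need to wait for the garbage collector.
            return CanOpenExclusively(path);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If you drop the using block and never call Close or Dispose, the second open can keep failing until the garbage collector happens to finalize the first stream.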
I obviously need to read and study a lot more about .Net Memory
Management. Deeper into Yahoo I now go. For now the following *Delphi*
article has my head spinning:

http://dn.codegear.com/article/28344

There are other memory managers available for Delphi that do this as
well, but the methods are not directly applicable in .NET. The JCL
class libraries come to mind.

The two types of leaks you need to concern yourself with are:

1. Not explicitly closing streams/sockets (IDisposable) when you're done
with the objects, holding the references longer than needed
2. Holding live references to data that you no longer need, which
increases the memory usage of your application

You should not have to concern yourself (much) with leaks in the sense
of unmanaged resources, unless you start mucking around with P/Invoke.
When you do, think "Here there be dragons" and find some good
information about it before diving in.
I know zilch about Delphi and am hoping that a similar C# approach is
out there somewhere. The memory dump and tagging in the above article
is very interesting. Excellent tool and article ... just don't force
me into Delphi, please!

I would recommend GUI programming in Python before recommending Delphi
nowadays :)
 
Tom

Thank you Jesse !! :)

Excellent clarification and example analogy.

Awesome guidelines too.

I learn a lot just lurking and I am sure your efforts as a response to
my confusion help many others as well. :)

I'm now fully alerted to the IDisposable interface and the importance
that its presence indicates.

As an indicator of helpfulness ... I find I am reading your response
slowly and multiple times to absorb it. That's a good thing !! The
clarity is perfect. The depth requires some iteration for my thick
skull. ;)

Thanks again.

-- Tom
 
Tom

Thank you again >> Lasse Karlsen !!

Your explanation reflects serious experience and is helping me to
catch on to some important aspects that have remained outside my grasp
even after having read numerous books.

You (as well as Jesse) emphasize the importance of the interface
IDisposable. The implication of its presence had not yet sunk in for
me!

The tip on "using" and the ease of its usage is perfectly
demonstrated in your example statements. I have used using before ...
but not fully understood the interface that allows such structure in
one's code. I more or less just copied a using statement from an
example in a textbook ... now I "get it" a little more!! :))

Either the texts I've read did not adequately emphasize Dispose ... or
I simply did not assimilate it even after multiple readings. I guess
in my learning mode I assumed all the cleaning up would be done in the
destructor and I missed the importance of the interfaces. Now I am
getting my feet wet with objects that implement interfaces and am
seeing the importance of Dispose. Funny how the texts discuss
interfaces ... but skimp on the practical implications. But again,
perhaps I was just too slow to absorb so many new topics.

I'm too new with C# to dive into Python yet. But I am very curious
about it now. << Another very good thing!!

Sincere thanks again!

-- Tom
 
Peter Duniho

[...]
Either the texts I've read did not adequately emphasize Dispose ... or
I simply did not assimilate it even after multiple readings. I guess
in my learning mode I assumed all the cleaning up would be done in the
destructor

Another clarification: in C# there is no "destructor". There's a
finalizer that uses basically the same syntax as a C++ destructor, but
it's not a destructor.

In particular: there is no guarantee as to when a finalizer might run, or
even _if_ it might run. You cannot rely on it being executed even when
you've properly released references to the object. That's why IDisposable
is so important; it's what allows your code to explicitly tell a class
instance that you're done with it and that it can release unmanaged
resources.

For what it's worth, I think this issue is in fact poorly advertised, in
spite of its importance. It's not so much that there's a lack of
documentation; if you go looking for it, MSDN actually has a number of
useful articles and pages discussing the issue. It's just that it's kind
of easy to just start writing C# code without ever really being exposed to
the issue. You can get pretty far, depending on what you're doing,
without running into problems, even if you're not disposing things
properly.

As an example as to how it might be made more obvious: Visual Studio's
auto-formatting stuff could, when you write a variable declaration with an
initializer for a type that implements IDisposable, automatically insert a
"using" statement around the whole thing. That's a little dramatic, and I
think one would want an option to turn it off, but disposing things is so
important that I think it'd be worth it to make sure newbies are exposed
early and often to the idea of disposing things.

Anyway, don't feel so bad for not having known this already. It's just
something all people new to C# (or at least, a garbage-collecting
paradigm...finalizers and disposing aren't unique to C#) need to learn.

Pete
 
Marc Gravell

For what it's worth, I think this issue is in fact poorly advertised, in  
spite of its importance.

I couldn't agree more... what amazed me was that the MS Press
framework MCP books hardly ever use "using" in examples. Yes, I
know page space is restricted, but even just a few key times?
Indeed, the 70-536 book (foundation; the pre-req for any .NET MCP
stuff) says this (only) "Close and dispose of nonmemory resources in
the Finally clause of a Try block". A plus for noting that things
should be disposed, but a minus for this strategy (compared with
"using"); what a wasted opportunity... of course, there are a lot of
other problems in that particular book... ;-p

Marc
 
