Can GC.Collect be trusted?

Christopher Ireland

Jon said:
If they *are* saying that, then I disagree - it's easy to assign
something to a static variable and forget about it. You'll then
potentially have what looks like a memory leak, and it would certainly
be due to a bug in your code. Not a failure to call free(), but a
failure to effectively make the object eligible for garbage
collection.

I agree. It's a question of how the terminology is understood. As I understand
it, in an environment where memory is managed, like the .NET Framework,
memory which is not released when the application terminates is memory that
the environment has failed to manage effectively.

Managed memory which accumulates while the application is running is not
what I would understand as a memory leak (the memory hasn't "leaked" out of
control of the application) but certainly looks like one and is due, as you
say, to not making the relevant objects eligible for garbage collection.
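
The static-variable case Jon describes can be sketched roughly as follows. This is only an illustration: `ReportCache`, `Remember` and `Forget` are hypothetical names, not code from anyone in this thread.

```csharp
using System;
using System.Collections.Generic;

public static class ReportCache
{
    // Hypothetical static field: it acts as a GC root for as long as the
    // type stays loaded, so everything it references stays reachable.
    private static readonly List<byte[]> Entries = new List<byte[]>();

    public static int Count
    {
        get { return Entries.Count; }
    }

    // Every call adds a buffer that remains reachable through the static list.
    public static void Remember(byte[] report)
    {
        Entries.Add(report);
    }

    // The fix lives in application code: drop the references so that the
    // buffers become eligible for collection.
    public static void Forget()
    {
        Entries.Clear();
    }
}

public static class StaticRootDemo
{
    public static void Main()
    {
        for (int i = 0; i < 100; i++)
        {
            ReportCache.Remember(new byte[1024 * 1024]); // 1 MB per iteration
        }

        GC.Collect(); // cannot reclaim the buffers: they are still referenced
        Console.WriteLine("Entries held after collect: " + ReportCache.Count);

        ReportCache.Forget();
        GC.Collect(); // the buffers are now unreachable and may be reclaimed
        Console.WriteLine("Entries held after Forget: " + ReportCache.Count);
    }
}
```

The GC behaves correctly in both halves of `Main`; what changes is only whether the application still holds references.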
 
Christopher Ireland

Jon said:
No, it doesn't mean a bug in the GC. It almost certainly means a bug
in Ward's code, where he's got a reference to an object even though
his design (or whatever) says that he shouldn't. I believe he's
trying to track down that bug. Read Peter's sentence very carefully:

<quote>
He's talking about objects to which he still has references, but
*shouldn't*, and so they stay allocated even though they should have
been released.
</quote>

You're right, I hadn't read the sentence as carefully as I should have done
and yes, re-reading it makes it clear that the bug is not in the GC.

--
Thank you,

Christopher Ireland

"Peace comes from within. Do not seek it without."
Siddhartha Gautama
 
Christopher Ireland

Jon said:
Certainly a *user* couldn't care less whether their memory is being
eaten due to references that are still available or due to memory
which no code knows about - and it doesn't make huge odds when it
comes to debugging, either.

No, but it does make a difference to whether or not you can effect a change
in how the memory is being used. If the memory is being used by references
that are still available then you can change it; if it is due to a bug in the
GC then you can't. In an unmanaged environment you are completely
responsible for memory management and can therefore always effect a change,
AFAIK.
 
Jon Skeet [C# MVP]

Christopher Ireland said:
No, but it does make a difference to whether or not you can effect a change
in how the memory is being used. If the memory is being used by references
that are still available then you can change it; if it is due to a bug in the
GC then you can't. In an unmanaged environment you are completely
responsible for memory management and can therefore always effect a change,
AFAIK.

No, in neither case are you completely responsible.

Very, *very* few unmanaged Windows programs don't use either third
party libraries or Win32 calls. I view a memory leak in the GC as being
comparable to a memory leak in those third party libraries or Win32
calls - basically, you have a problem which is hard to deal with in
either case.

Likewise, in both managed code and unmanaged code, memory leaks are
more *likely* to be due to bugs in your own code.

The tools you might use to track down such problems will clearly be
different, but both kinds of leak are definitely possible in both
situations.
 
Christopher Ireland

Jon said:
No, in neither case are you completely responsible.

One is always completely responsible for one's code and it was never my
intention to suggest otherwise. What I'm saying is that in a managed memory
environment you are not responsible for the direct allocation and
deallocation of memory and therefore have to rely on an intermediate layer
that is. What this means is that there are instances in the running of a
program where an object is not in scope and where a programmer cannot
directly destroy it, giving rise to this object "unnecessarily" occupying
memory and therefore the appearance of a memory leak.

Very, *very* few unmanaged Windows programs don't use either third
party libraries or Win32 calls. I view a memory leak in the GC as
being comparable to a memory leak in those third party libraries or
Win32 calls - basically, you have a problem which is hard to deal
with in either case.

Likewise, in both managed code and unmanaged code, memory leaks are
more *likely* to be due to bugs in your own code.

I think I've been mistaken in overly referring to bugs in the GC and I agree
with you that this is by far the least likely scenario.

The tools you might use to track down such problems will clearly be
different, but both kinds of leak are definitely possible in both
situations.

Yes, the question is: once you know which object is unnecessarily occupying
space, can you guarantee its destruction before the program terminates? I
think this is the question that Ward is going to be asking himself once he's
tracked his object down.
 
Jon Skeet [C# MVP]

Christopher Ireland said:
One is always completely responsible for one's code and it was never my
intention to suggest otherwise. What I'm saying is that in a managed memory
environment you are not responsible for the direct allocation and
deallocation of memory and therefore have to rely on an intermediate layer
that is. What this means is that there are instances in the running of a
program where an object is not in scope and where a programmer cannot
directly destroy it, giving rise to this object "unnecessarily" occupying
memory and therefore the appearance of a memory leak.

Indeed. My point was that even in the unmanaged world, while you will
be directly responsible for *some* (quite possibly most) of the memory
allocation, you're unlikely to be responsible for *all* of it.

Even if you're calling malloc()/free(), that's still going through an
intermediate layer. I believe free() in Windows had a subtle and rare
bug at some point, for instance...

<snip>
 
Willy Denoyette [MVP]

Ward Bekker said:
Hi Christopher,

My definition of a memory leak for managed frameworks:

All objects that should be garbage collected, but can't because they are still referenced
by other objects that will not be garbage collected ;-)

Objects that are still referenced should not be garbage collected; it would be a serious
bug if the GC collected such objects ;-)
The problem you are describing is not a real "memory leak". The problem is that you don't
know what is keeping a reference to the object, so you aren't able to release the object by
setting that reference to null. This is an application bug disguised as a leak.
A "real leak" is a part of memory, occupied by a non-referenced object, that stays
allocated on the GC heap after a GC run. If the GC can't deallocate the memory it will stay
in the heap until the process terminates, and nothing is in control of this chunk of
memory any longer. In other words, a leak in the managed heap is the result of a CLR bug,
possibly a GC bug.

Willy.
 
Christopher Ireland

Jon said:
Indeed. My point was that even in the unmanaged world, while you will
be directly responsible for *some* (quite possibly most) of the memory
allocation, you're unlikely to be responsible for *all* of it.

Even if you're calling malloc()/free(), that's still going through an
intermediate layer. I believe free() in Windows had a subtle and rare
bug at some point, for instance...

Sure, although I think there is a fundamental qualitative and quantitative
difference between the intermediate layer that affects memory in unmanaged
applications and the one in managed applications. Non-deterministic memory
management by its very nature means that programmers cannot make completely
predictable changes to memory allocation through it.
 
Ward Bekker

Hehe, my question caused quite some discussion. Thank you for your help!

Let me try to clarify a bit more:

1. I'm hunting for bugs in my own code, specifically for object graphs
that are still connected to the root object but are no longer needed.
These objects are correctly not garbage collected because the code did
not release its references to them. I use Ants Profiler, among other
tools, to see which objects are still in memory and alive.

2. I want to make sure that all objects that can be garbage collected
(disconnected from the root object) _are_ no longer alive. The
documentation is not very clear on whether GC.Collect will actually
throw out the trash.

Jon Skeet perhaps explains better what I'm trying to do. See
news://news.microsoft.com:119/[email protected]

Besides Jon's suggestion, I also found this method:
GC.GetTotalMemory(true).

According to the documentation, the true argument tells the method to wait
for the garbage collection to finish before returning, so the result will be
more accurate. I noticed a delay when executing it when there really was
stuff to collect.

Maybe I should do both ways, just to make sure ;-)
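
A minimal sketch of doing both checks together is shown below. It assumes the widely used Collect / WaitForPendingFinalizers / Collect sequence, which may or may not be what the linked post suggests, and the names `CollectCheck`, `ForceFullCollection` and `AllocateUnrooted` are hypothetical. A weak reference is used as a probe because it does not itself keep the object alive.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CollectCheck
{
    // Full blocking collection, then let pending finalizers run, then collect
    // again so that objects released by those finalizers can be reclaimed too.
    public static void ForceFullCollection()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }

    // Allocate in a separate, non-inlined method so that no local variable in
    // the caller keeps the object alive by accident.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateUnrooted()
    {
        return new WeakReference(new byte[1024 * 1024]);
    }

    public static void Main()
    {
        byte[] rooted = new byte[1024 * 1024];
        WeakReference rootedProbe = new WeakReference(rooted);
        WeakReference unrootedProbe = AllocateUnrooted();

        ForceFullCollection();
        long managedBytes = GC.GetTotalMemory(true);

        Console.WriteLine("Rooted object alive:   " + rootedProbe.IsAlive);
        Console.WriteLine("Unrooted object alive: " + unrootedProbe.IsAlive);
        Console.WriteLine("Managed bytes in use:  " + managedBytes);

        GC.KeepAlive(rooted); // keeps 'rooted' reachable up to this point
    }
}
```

If a probe for an object that should be dead still reports it as alive after this sequence, the more likely explanation is a reference still held somewhere in the application, which is the kind of bug being hunted in point 1.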

Thank you very much,

Ward
 
mpetrotta

2. I want to make sure that all objects that can be garbage collected
(disconnected from the root object) _are_ no longer alive. The
documentation is not very clear on whether GC.Collect will actually
throw out the trash.

Seems pretty clear to me, if lacking in detail:
"However, the Collect method does not guarantee that all inaccessible
memory is reclaimed."

Michael
 
Willy Denoyette [MVP]

Seems pretty clear to me, if lacking in detail:
"However, the Collect method does not guarantee that all inaccessible
memory is reclaimed."


Reclaimed by whom? The task of the GC is to collect the inaccessible objects and, if
possible, compact the heap. It's not the task of the GC to return the memory to the OS; the
GC can't even do that. It's up to the memory manager (another CLR component) to keep track
of the extra segments allocated from the process heap and (eventually) return these when
they don't contain any GC heap data (managed objects and other CLR data).
Anyway, the memory manager won't ever return the default process heap segments occupied by
the managed heap (32 MB or 64 MB) to the OS.

Willy.
 
mpetrotta

Reclaimed by whom? The task of the GC is to collect the inaccessible objects and, if
possible, compact the heap. It's not the task of the GC to return the memory to the OS; the
GC can't even do that. It's up to the memory manager (another CLR component) to keep track
of the extra segments allocated from the process heap and (eventually) return these when
they don't contain any GC heap data (managed objects and other CLR data).
Anyway, the memory manager won't ever return the default process heap segments occupied by
the managed heap (32 MB or 64 MB) to the OS.

I don't see that it matters. The OP is using Ants Profiler (and its
"Live Objects" view) to profile his memory usage. I'm not very
familiar with that tool, so I just tried out the demo. That view
lists all objects on the managed heap (collected or not, apparently).
Once collected, objects should no longer appear in that view. Whether
the memory they consumed is still being used by the CLR is irrelevant
(for the OP, at any rate).

Michael
 
Willy Denoyette [MVP]

I don't see that it matters. The OP is using Ants Profiler (and its
"Live Objects" view) to profile his memory usage. I'm not very
familiar with that tool, so I just tried out the demo. That view
lists all objects on the managed heap (collected or not, apparently).
Once collected, objects should no longer appear in that view. Whether
the memory they consumed is still being used by the CLR is irrelevant
(for the OP, at any rate).

You have too much trust in such tools. First, collected objects should not appear in that
list. Note, however, that no single tool is able to show in real time what's happening in
the CLR. These tools simply rely on the counter data and other information that the CLR
supplies via a callback interface, and that data isn't updated in real time. Moreover,
whenever you attach a profiler you introduce side effects: the CLR behaves differently when
a profiler or debugger is attached.
A profiler, by the way, is not a means of detecting memory leaks. Profilers are meant to
profile application behavior, including allocation patterns (which parts of the program
allocate what kind of objects, how many, what their lifetimes are, etc.), but they can't
show you leaked memory. Debuggers (and here I mean unmanaged debuggers, not managed ones)
are meant to find these kinds of bugs, but keep in mind that they too are quite intrusive:
they create side effects that influence the behavior of the application. The only way to
find a leak is by taking a snapshot dump and investigating that dump offline using the
right tools for managed applications.


Willy.
 
Christopher Ireland

Willy,
The only way to find a leak is by taking a
snapshot dump and investigating that dump offline using the right
tools for managed applications.

Would you mind telling me how to create such a dump and which tools you
would use to analyse it?
 
Chris Mullins [MVP]

I talk a fair bit about how to get a dump here:
http://www.coversant.net/Coversant/Blogs/tabid/88/EntryID/28/Default.aspx

To analyze the dump, you can either use:
- WinDbg + Son of Strike (as I talk about in the blog). This is very painful
though.
- SciTech Memory Profiler (http://memprofiler.com/). Learning their GUI will
take a while, but once you do, the power is nothing short of amazing.

The SciTech one will also compare dumps - so you can get your app in a steady
state, capture a dump, perform an action, and capture a 2nd dump. I've
gotten a lot of use out of this...
 
Christopher Ireland

Chris,
- SciTech Memory Profiler (http://memprofiler.com/). Learning their GUI
will take a while, but once you do, the power is nothing short of amazing.

The SciTech one will also compare dumps - so you can get your app in a
steady state, capture a dump, perform an action, and capture a 2nd dump.
I've gotten a lot of use out of this...

Many thanks for your answer! In fact, I've been using AQTime
(http://www.automatedqa.com/products/aqtime/) to profile the performance of
my applications for some time now, but this is the first time I've wanted to
use its allocation profiler. It's interesting to note what AutomatedQA say
about memory leaks in .NET applications here:
http://automatedqa.com/techpapers/net_allocation_profiler.asp

They seem to suggest that memory leaks in .NET are a problem of the GC not
working properly. Would you agree with this analysis?

Thank you,

Christopher Ireland.
 
Chris Mullins [MVP]

I don't agree with that at all, really.

The scenario that their article repeatedly calls a bug isn't a GC
bug at all - it's a bug in the UpDown control. That control isn't removing
an event handler, thus keeping the control rooted for way longer than the
developer intended. The GC is working perfectly. It's a great "how to"
article on tracking down a leak, though.

I've yet to come across a single instance where the GC wasn't working as
designed. Given the type of work I do - large scale, high performance, high
availability server applications - and the amount of time I spend in
profilers (both performance and memory) - I know the area pretty well.

There have been a number of times where apps were leaking memory, but in all
cases it turned out to be developer bugs. Sometimes very subtle application
bugs or race conditions, but bugs nonetheless.

In this day and age, GC is pretty good. It lets junior devs build small apps
in a "fire and forget" type of way. Unfortunately, as complexity grows, that
"fire and forget" approach has to be replaced with a much deeper
understanding of how things work.

At the end of the day, as professional developers building complex
applications, we need to have a good understanding of how Garbage Collection
works. We need to understand Dispose patterns, Finalization, how the Managed
Heap works, and other related technologies. We need to understand what a
root path is, and how it keeps things from being collected. At the more
esoteric level, if you're building really complex apps, you need someone
around who understands how Pinning and Fragmentation work, Weak References,
Object Resurrection, the differences between the Workstation and Server CLR,
what Concurrent GC is, what the Large Object Heap is, and all the other good
stuff that's in there.
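
The event-handler scenario mentioned earlier in this post, where a subscription that is never removed keeps an object rooted, can be sketched in general terms as follows. This is not the UpDown control's actual code; `Publisher`, `Subscriber` and `SubscriberCount` are hypothetical names used only for illustration.

```csharp
using System;

public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently attached to the event.
    public int SubscriberCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }
}

public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        // The publisher's delegate list now holds a reference to this
        // instance, so it stays reachable for as long as the publisher does.
        _publisher.Changed += OnChanged;
    }

    private void OnChanged(object sender, EventArgs e)
    {
    }

    // Removing the handler breaks the root path from publisher to subscriber.
    public void Dispose()
    {
        _publisher.Changed -= OnChanged;
    }
}

public static class EventRootDemo
{
    public static void Main()
    {
        Publisher publisher = new Publisher();
        Subscriber subscriber = new Subscriber(publisher);
        Console.WriteLine("Handlers attached: " + publisher.SubscriberCount);

        subscriber.Dispose();
        Console.WriteLine("Handlers after Dispose: " + publisher.SubscriberCount);
    }
}
```

If `Dispose` is never called, a long-lived publisher keeps every subscriber alive, and the GC is right not to collect them.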
 
Christopher Ireland

Chris said:
There have been a number of times where apps were leaking memory, but
in all cases it turned out to be developer bugs. Sometimes very
subtle application bugs or race conditions, but bugs nonetheless.

Can you please confirm to me that with the GC working well, any memory leaks
that occur while the application is running leave no "footprint" in memory
when the application is closed? Another thing I would be very grateful if
you could clarify for me is whether memory leaks which occur when the (100%
managed) application is running can be seen in perfmon's memory counter (or
the task manager's physical memory usage history, if it isn't the same
thing).

In this day and age, GC is pretty good. It lets junior devs build
small apps in a "fire and forget" type of way. Unfortunately, as
complexity grows, that "fire and forget" approach has to be replaced
with a much deeper understanding of how things work.

Yes, I agree.

At the end of the day, as professional developers building complex
applications, we need to have a good understanding of how Garbage
Collection works. We need to understand Dispose patterns,
Finalization, how the Managed Heap works, and other related
technologies. We need to understand what a root path is, and how it
keeps things from being collected. At the more esoteric level, if
you're building really complex apps, you need someone around who
understands how Pinning and Fragmentation work, Weak References,
Object Resurrection, the differences between the Workstation and
Server CLR, what Concurrent GC is, what the Large Object Heap is, and
all the other good stuff that's in there.

Great, these are exactly the kind of keywords I was after. Could I please
try your patience one more time and ask you for what you consider to be a
good link (or even book!) on the matter?
 
Willy Denoyette [MVP]

Chris Mullins said:
I don't agree with that at all, really.

The scenario that their article repeatedly calls a bug isn't a
GC bug at all - it's a bug in the UpDown control. That control isn't
removing an event handler, thus keeping the control rooted for way longer
than the developer intended. The GC is working perfectly. It's a great
"how to" article on tracking down a leak, though.

I've yet to come across a single instance where the GC wasn't working as
designed. Given the type of work I do - large scale, high performance,
high availability server applications - and the amount of time I spend in
profilers (both performance and memory) - I know the area pretty well.

There have been a number of times where apps were leaking memory, but in
all cases it turned out to be developer bugs. Sometimes very subtle
application bugs or race conditions, but bugs nonetheless.

In this day and age, GC is pretty good. It lets junior devs build small
apps in a "fire and forget" type of way. Unfortunately, as complexity
grows, that "fire and forget" approach has to be replaced with a much
deeper understanding of how things work.

At the end of the day, as professional developers building complex
applications, we need to have a good understanding of how Garbage
Collection works. We need to understand Dispose patterns, Finalization,
how the Managed Heap works, and other related technologies. We need to
understand what a root path is, and how it keeps things from being
collected. At the more esoteric level, if you're building really complex
apps, you need someone around who understands how Pinning and
Fragmentation work, Weak References, Object Resurrection, the differences
between the Workstation and Server CLR, what Concurrent GC is, what the
Large Object Heap is, and all the other good stuff that's in there.


Note also that this article was written years ago, is based on the
first version of the framework, and has never been updated since.
These are the kind of "look how great our product is, it can even find bugs
in MSFT's products" articles, but they did not find anything. This bug was
known at the time they ran the profiler; all they did was illustrate the
impact of a bug in the Windows.Forms code on the allocation pattern.
You won't be able to find these kinds of "bugs" using just a profiler. You
need a debugger, a great deal of understanding of how the GC and the CLR
work, a lot of experience in debugging complex systems, plus quite some luck
to find small leaks like these. Granted, a profiler is something you need in
your toolbox. It's a great tool to illustrate performance behaviors and
uncover possible bottlenecks, and it can help you better understand your
object allocation patterns and where you allocate too many or too large
objects, so you can adapt your algorithms and allocation needs. But you won't
ever be able to detect small managed memory leaks by using a profiler alone.

Willy.
 
Jon Skeet [C# MVP]

Christopher Ireland said:
Can you please confirm to me that with the GC working well, any memory leaks
that occur while the application is running leave no "footprint" in memory
when the application is closed?

If they don't, that's an operating system failure more than a CLR
failure. An operating system really, really, really should release any
memory held by a process when the process exits (assuming it's not
deliberately sharing memory with another process etc).

Another thing I would be very grateful if you could clarify for me is
whether memory leaks which occur when the (100% managed) application
is running can be seen in perfmon's memory counter (or the task
manager's physical memory usage history, if it isn't the same thing).

They may be visible there. It's a lot more reliable to use the CLR
performance counters in perfmon though.
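
To see why the two views can differ, a program can print its own managed heap size next to the process-wide figure. The sketch below does not read the perfmon counters themselves; it only uses `GC.GetTotalMemory`, `GC.CollectionCount` and `Process.WorkingSet64`, and the class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class MemoryViews
{
    // Bytes the GC currently believes are allocated on the managed heap.
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(false);
    }

    // Physical memory the OS currently has mapped for the whole process,
    // including the runtime itself and any unmanaged allocations.
    public static long WorkingSetBytes()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            return self.WorkingSet64;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Managed heap bytes:       " + ManagedBytes());
        Console.WriteLine("Process working set:      " + WorkingSetBytes());
        Console.WriteLine("Gen 2 collections so far: " + GC.CollectionCount(2));
    }
}
```

The working set is roughly what Task Manager reports, so it can stay high after a collection even when the managed figure has dropped.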

<snip>
 
