GC.Collect: Exactly how does it work?

Frank Rizzo

I understand the basic premise: when the object is out of scope or has
been set to null (given that there are no funky finalizers), executing
GC.Collect will clean up your resources.

So I have a basic test. I read a bunch of data into a dataset by using
a command and a data adapter object, .Dispose(ing) as I go. The moment
the data is in the Dataset, the Mem Usage column in the Task Manager
goes up by 50 MB (which is about right). I then .Dispose the Dataset,
set it to null and call GC.Collect. The Mem Usage column reports that
out of 50 MB, 44MB has been reclaimed. I attempt to call GC.Collect a
few more times, but the Mem Usage never goes back to the original. 6 MB
has been lost/leaked somewhere.

What am I missing here?
Regards
 
James Curran

Task Manager does not report the memory in use, but the memory requested
by the application from the OS. After you free it, the application (.NET
Framework) marks it as free but holds on to it, figuring that since you used
that much once, you'll need it again. The OS can seize the memory back if it
needs it for some other application, but in your test it didn't need to.

--
Truth,
James Curran
[erstwhile VC++ MVP]

Home: www.noveltheory.com Work: www.njtheater.com
Blog: www.honestillusion.com Day Job: www.partsearch.com
 
Frank Rizzo

James said:
Task Manager does not report the memory in use, but the memory requested
by the application from the OS. After you free it, the application (.NET
Framework) marks it as free but holds on to it, figuring that since you used
that much once, you'll need it again. The OS can seize the memory back if it
needs it for some other application, but in your test it didn't need to.

So how can I measure the real memory usage of the application?
 
Ignacio Machin ( .NET/ C# MVP )

It depends on what you mean by "real".

For the OS and for other programs, your memory usage is what Task Manager
reports. It's the chunk of memory assigned to your program.

Frankly, I don't know for sure how to find the memory being used by live
objects. What does GC.GetTotalMemory give you?
On second thought, I think that unless the GC maintains a counter of memory
allocated/freed, there is no way to know this. Maybe somebody with deeper
knowledge of the GC implementation can give you a better answer.
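
To make the two numbers discussed here concrete, the sketch below prints the managed heap size reported by GC.GetTotalMemory next to the process working set, which is roughly what the Mem Usage column shows. The class and method names (MemoryReport, ManagedBytes, WorkingSetBytes) are made up for this illustration. Process.WorkingSet64 assumes .NET 2.0 or later; on v1.x the equivalent property is Process.WorkingSet.

```csharp
using System;
using System.Diagnostics;

static class MemoryReport
{
    // Bytes currently allocated on the managed heap. Passing true asks the
    // GC to collect first, so the figure approximates the size of live objects.
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(true);
    }

    // Working set of this process: the physical memory currently assigned
    // to it by the OS, whether or not the managed heap is using it.
    public static long WorkingSetBytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.WorkingSet64;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Managed heap: {0:N0} bytes", ManagedBytes());
        Console.WriteLine("Working set:  {0:N0} bytes", WorkingSetBytes());
    }
}
```

Calling both helpers before loading the DataSet, after loading it, and after GC.Collect should show the managed figure dropping back while the working set stays higher, which is the gap described in this thread.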


cheers,

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
 
Willy Denoyette [MVP]

Frank Rizzo said:
I understand the basic premise: when the object is out of scope or has
been set to null (given that there are no funky finalizers), executing
GC.Collect will clean up your resources.

So I have a basic test. I read a bunch of data into a dataset by using a
command and a data adapter object, .Dispose(ing) as I go. The moment the
data is in the Dataset, the Mem Usage column in the Task Manager goes up
by 50 MB (which is about right). I then .Dispose the Dataset, set it to
null and call GC.Collect. The Mem Usage column reports that out of 50 MB,
44MB has been reclaimed. I attempt to call GC.Collect a few more times,
but the Mem Usage never goes back to the original. 6 MB has been
lost/leaked somewhere.

What am I missing here?
Regards

The GC heap is just another Win32 process heap, initially created by the OS
at the request of the CLR. It consists of two segments of 16 MB each (16 KB
committed): one for the Gen0-2 objects and one for the Large Object heap.
When you start to instantiate (non-large) objects, the committed space in
the first segment starts to grow. Now suppose that you keep instantiating
objects without ever releasing any instance until the segment is full. When
that happens, the CLR asks the OS for another segment of 16 MB (16 KB
committed) and continues to allocate object space from that segment.
Suppose the second segment is also full when you release all of the
allocated objects (supposing that is possible). The GC starts to collect and
compact the heap, say until all objects are gone. That leaves you with a GC
heap of 32 MB, consisting of two segments of 16 MB committed space. The GC
has plenty of free space in the heap, but the heap space is not returned to
the OS unless there is memory pressure.
Under memory pressure, the OS signals the CLR to trim its working set, and
the CLR returns the additional segment to the OS.
So what you noticed is simply what is described above: you have plenty of
free memory and the OS is not reclaiming anything from the running
processes.


Willy.
 
Frank Rizzo

Thanks, Willy. I understand the part you described (thanks to you, in
another thread), however the part I don't get is how GC.Collect actually
works. You mentioned that when the objects are released, GC collects &
compacts the 16MB sets but does not release those sets to the OS. Is
that what GC.Collect does: just collect & compact, but not release/trim
working set?

If that's the case, how come the Mem Usage column in the Task Manager
does go down when GC.Collect is executed (and there is no memory
pressure)? An additional question here: how can I signal the CLR to
reduce (i.e. release/trim) its original set (since GC.Collect won't do it)?

If that's not the case and GC.Collect does in fact collect/compact and
release/trim, why am I losing 6 MB in the process?

Regards
 
Willy Denoyette [MVP]

Frank Rizzo said:
Thanks, Willy. I understand the part you described (thanks to you, in
another thread), however the part I don't get is how GC.Collect actually
works. You mentioned that when the objects are released, GC collects &
compacts the 16MB sets but does not release those sets to the OS. Is that
what GC.Collect does: just collect & compact, but not release/trim working
set?

If that's the case, how come the Mem Usage column in the Task Manager does
go down when GC.Collect is executed (and there is no memory pressure)? An
additional question here: how can I signal the CLR to reduce (i.e.
release/trim) its original set (since GC.Collect won't do it)?

If that's not the case and GC.Collect does in fact collect/compact and
release/trim, why am I losing 6 MB in the process?

Regards

OK, let me start with a small correction and a disclaimer. The disclaimer
first: what I'm talking about is valid for v1.x and only for the workstation
version of the GC. The correction is that at the start the CLR reserves 2
segments of 16 MB (each having 72 KB committed) for the gen0-2 heap, plus a
16 MB segment for the LOH.

Consider the following (console) sample, and say we break at [1], [2] and
[3] respectively to take a look at the managed heap:

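using System;
using System.Collections;

class Program
{
    static int Main()
    {
        // [1] break here: heap as it looks at process start
        ArrayList[] al = new ArrayList[1000000];
        for (int m = 0; m < 1000000; m++)
            al[m] = new ArrayList(1);
        // [2] break here: one million small objects are reachable
        for (int n = 0; n < 1000000; n++)
        {
            al[n] = null;
        }
        GC.Collect();
        // [3] break here: objects released and a collection forced
        return 0;
    }
}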

At the start [1] of a (CLR hosted) process the GC heap looks like this:

|_--------------|_----------------|......................|---------------|
       S0               S1                 free              LOH 16MB

S0 = 16 MB, 72 KB committed (committed regions are shown as _)
S1 = 16 MB, 72 KB committed
Objects allocated at the start of the program fit in the initial committed
part of the S0 segment, so this committed region contains gen0, 1 and 2.
Say the reachable objects account for 6 KB of heap space here.

When we break at 2, the heap has grown such that S0 and S1 are completely
filled (committed regions) and a third segment had to be created.

|______________|_______________|....|________------|....|---------------|
       S0             S1                   S2               LOH 16MB

S0 = 16 MB, x MB committed
S1 = 16 MB, y MB committed
S2 = 16 MB, z MB committed
S0 and S1 contain Gen2 objects (those that survived recent collections).
S2 now holds Gen1 and Gen0.
Total object space ~42 MB

Let's Force a Collection and break at 3, now the heap looks like:

|_---------------|____------------|....|_____----------|....|---------------|
        S0               S1                   S2                LOH 16MB

S0 = 16 MB, x MB committed
S1 = 16 MB, y MB committed
S2 = 16 MB, z MB committed
Total object space = what we had at [1] (6 KB), but as you can see, the CLR
didn't de-commit all regions and didn't return segment S2 to the OS.
The amount of region space that stays committed depends on a number of
heuristics, such as the allocation pattern and the frequency of the most
recent object allocations.
When you run the above sample you'll see that x, y and z together account
for ~10 MB (your mileage may vary, of course), so when you look at the
working set of the process, you'll notice a growth of ~10 MB too. So say we
started with 6 MB at [1]; we will see 16 MB when we are at [3].
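
The remark above that the older segments hold Gen2 objects "that survived recent collections" can be observed from managed code. The following is only a sketch: GenerationDemo and Observe are names invented for this example, and the exact generation numbers printed are up to the runtime, so none are promised here. It uses GC.GetGeneration to ask which generation an object belongs to before and after forced collections.

```csharp
using System;

static class GenerationDemo
{
    // Records the generation of one object at allocation time and after
    // each of two forced collections that it survives.
    public static int[] Observe()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);   // freshly allocated
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);   // survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);   // survived two collections

        GC.KeepAlive(survivor);                 // keep the object reachable up to this point
        return seen;
    }

    public static void Main()
    {
        int[] seen = Observe();
        Console.WriteLine("At allocation:           gen {0}", seen[0]);
        Console.WriteLine("After first GC.Collect:  gen {0}", seen[1]);
        Console.WriteLine("After second GC.Collect: gen {0}", seen[2]);
        Console.WriteLine("Highest generation:      gen {0}", GC.MaxGeneration);
    }
}
```

An object that keeps surviving is normally promoted towards GC.MaxGeneration, which is why long-lived data ends up in the older segments in the diagrams.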

What you could do (but you should never do that) is try to reduce the
working set of the process by setting the Process.MaxWorkingSet property.
Note that this will not change the heap layout and will not return anything
to the OS; the only thing it does is force a page-out of unused process
pages.
Changing the committed region space and the allocated segment space is in
the hands of the CLR and the OS. Both of them know what to do, and when,
much better than you do, so keep it that way. After all, this is why GC
memory allocators were invented, right?

Willy.
 
Frank Rizzo

Willy, thanks, very enlightening. I ran the test and it turned out just
like you said. I do have a couple of followup questions:

1. Where do you get all this information? I've read a lot of
literature on this topic (Jeffrey Richter's work and some others, gotten
to know the Allocator Profiler, etc.), but I haven't seen any references
anywhere to the size of a segment's committed RAM, etc.

2. What constitutes an LOH, how big does an object have to be? What are
the rules for compacting/disposing/releasing it?

3. You mentioned that this applies to the workstation version of the
CLR. My software will run on Win2k servers and Windows 2003 servers
(not advanced, just standard). How are the rules different for the
servers?

4. In the example you described, after the 3rd breakpoint, I applied
some memory pressure (the PC dipped into virtual memory). The Mem Usage
column of the console app kept going lower and lower (the more pressure
I applied). Eventually it bottomed out at 100k. Am I to believe that
the whole little console app can be run in 100k? If not, where did it
all go?

Thank you.

 
Willy Denoyette [MVP]

Frank, See inline.

Willy.

Frank Rizzo said:
Willy, thanks, very enlightening. I ran the test and it turned out just
like you said. I do have a couple of followup questions:

1. Where do you get all this information? I've read a lot of literature
on this topic (Jeffrey Richter's work and some others, gotten to know the
Allocator Profiler, etc.), but I haven't seen any references anywhere to
the size of a segment's committed RAM, etc.

By doing a lot of debugging, using low-level profilers and tools, and
peeking into the CLR sources. Note also that a managed process is just a
Win32 process: the OS has no idea what the CLR is, and the process data
structures are exactly the same as those of any non-CLR Win32 process. The
CLR manages its own small environment and has its own memory allocator and
GC, but this is nothing new: the VB6 runtime also has a GC and a memory
allocator, C++ runtimes offer several possible memory allocators, and all of
them use the common OS heap/memory manager.

2. What constitutes an LOH, how big does an object have to be? What are
the rules for compacting/disposing/releasing it?

Objects larger than about 85 KB (the threshold is 85,000 bytes) go to the
LOH. The rules for disposing and releasing are the same as for the smaller
objects. The LOH is not compacted; the garbage in it is only collected.
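
As a rough illustration of that threshold, the sketch below asks the GC which generation a freshly allocated array belongs to. It relies on the commonly observed behaviour that LOH objects are reported as the highest generation (GC.MaxGeneration), while small new objects start in gen 0. The class name LohDemo and the sizes 1,000 and 100,000 bytes are choices made for this example, not something from the thread.

```csharp
using System;

static class LohDemo
{
    // Allocates a byte array of the given size and returns the generation
    // the GC reports for it immediately after allocation.
    public static int GenerationOfNewArray(int sizeInBytes)
    {
        byte[] buffer = new byte[sizeInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer);   // keep the array reachable until it has been inspected
        return generation;
    }

    public static void Main()
    {
        // Well below the threshold: allocated on the normal (gen0-2) heap.
        Console.WriteLine("1,000-byte array:   gen {0}", GenerationOfNewArray(1000));

        // Well above the threshold: allocated on the Large Object Heap.
        Console.WriteLine("100,000-byte array: gen {0}", GenerationOfNewArray(100000));

        Console.WriteLine("GC.MaxGeneration:   {0}", GC.MaxGeneration);
    }
}
```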

3. You mentioned that this applies to the workstation version of the CLR.
My software will run on Win2k servers and Windows 2003 servers (not
advanced, just standard). How are the rules different for the servers?

The server version of the GC must be explicitly loaded and is only available
on multiprocessor machines (this includes HT). You can load the server GC
by specifying it in your application's config file:
<configuration>
  <runtime>
    <gcServer enabled="true" />
  </runtime>
</configuration>
or by hosting the CLR.
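
One way to check whether that setting actually took effect is to ask the runtime at startup. This is a small sketch that assumes .NET 2.0 or later, where System.Runtime.GCSettings.IsServerGC is available (it does not exist in the v1.x runtime discussed in this thread); GcModeCheck and Describe are names chosen for the example.

```csharp
using System;
using System.Runtime;

static class GcModeCheck
{
    // Returns "server" when the server GC is active, otherwise "workstation".
    public static string Describe()
    {
        return GCSettings.IsServerGC ? "server" : "workstation";
    }

    public static void Main()
    {
        Console.WriteLine("GC mode: {0}", Describe());
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
    }
}
```

If the config file requests the server GC but the machine has a single processor, the runtime is expected to fall back to the workstation GC, so the check can report "workstation" even with the setting present.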

4. In the example you described, after the 3rd breakpoint, I applied some
memory pressure (the PC dipped into virtual memory). The Mem Usage column
of the console app kept going lower and lower (the more pressure I
applied). Eventually it bottomed out at 100k. Am I to believe that the
whole little console app can be run in 100k? If not, where did it all go?

The trimmed R/W pages go to the paging file; the RO pages are thrown away
and will be reloaded from the image file (.exe, .dll, etc.) when needed.
No, a console application cannot run in 100 KB; the missing pages will be
reloaded from the page file or from the loaded libraries. That's why you
should never trim the working set yourself: all you do is trigger a lot of
page faults, with a lot of disk I/O as a result.
 
