Thoughts on garbage collection

Rein Petersen

Is there any way to adjust thread priority for the garbage collector?

Would be nice if I could tune the thread priority rules for the garbage
collector too...

From my reading on the process, when memory becomes limited, the garbage
collector is moved up in priority. I suspect those explanations are
oversimplified, because I would guess that Microsoft allows for more
pre-emptive reclaiming of memory so that performance degrades (and is
shared) gracefully. The few simple benchmark tests I've run have led me to
that conclusion anyway.

Still, I wonder if MS can really dynamically tune their garbage collector
for all occasions or if we are better off making our own fine tunings. I
only find the ability to manage assemblies, security policy, and remoting.
Ideally, Local Machine Settings for garbage collection tuning could be
protected from override in software settings using security policy.

Can anyone refer me to text on the subject (if it exists)?

Rein
 
Jan Tielens

You could implement the IDisposable interface and design pattern. If you
really want to get rid of a specific object, you can call the dispose method
of that object. In general this is used if you're dealing with unmanaged
resources. So I don't know if it could be of any help for you. You can check
it out here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfSystemIDisposableClassTopic.asp

--
Greetz

Jan Tielens
________________________________
Read my weblog: http://weblogs.asp.net/jan
 
Stu Smith

Are you finding the GC doesn't collect often enough? Collects too often?

What's the actual problem here, or is it more of a general thought?
 
Rein Petersen

Jan Tielens said:
You could implement the IDisposable interface and design pattern. If you
really want to get rid of a specific object, you can call the dispose method
of that object. In general this is used if you're dealing with unmanaged
resources. So I don't know if it could be of any help for you. You can check
it out here:
http://msdn.microsoft.com/library/d...ref/html/frlrfSystemIDisposableClassTopic.asp

Yep, IDisposable provides a means to explicitly rid of (larger) objects when
you know they are out of scope instead of waiting for the Garbage collector
to kick in when memory gets used up and it's recommended for unmanaged
resources (to kill them once you're done with them). But, in a scenario where
several important threads are running, the GC thread may get little
processor time to clean up until memory runs out and it becomes absolutely
necessary to increase its priority. It would be nice to ramp the Garbage
Collector's thread priority in inverse proportion to the amount of RAM
available. Without the ability to tune the priority of the GC, I'm inclined
to Dispose() everything - managed and unmanaged alike.

My biggest concern is to fully realize how my application will perform (and
affect other processes) when system resources (processor and ram) run low.
It's important to me that all processes play nicely in the sandbox. I don't
necessarily want the GC to go into overdrive and hog the processor (when the
memory runs out because it's kept in low priority until the very last
moment). It could be very disruptive to other processes trying to gracefully
share the processor. I'd rather allow it to run in equal priority and keep a
tight lid on memory usage.

It might be something MS considers for a later version of the CLR - until
then, I'll be fine explicitly calling Dispose() where I'm concerned about
stagnant memory usage.

Rein
 
Rein Petersen

Stu Smith said:
Are you finding the GC doesn't collect often enough? Collects too often?

What's the actual problem here, or is it more of a general thought?

Hi Stu, yeah I'm just trying to be thorough in my understanding of the gc.
And, I think the ability to tune gc thread priority would help me control
memory usage and buffer processing spikes in a scenario where resources are
scarce - I'm looking to accommodate the most graceful degradation of system
performance as resources become scarce. I'm certain MS has tuning rules
built into the GC but they aren't published (as far as I can tell). Leaves
me in limbo and I'm inclined to try and take matters into my own hands so
that I can guarantee behaviour. Seems to make sense that we would be able to
tune the gc but I've yet to encounter anything on the subject... I'm
starting to think I may have to wait for the next version of the CLR. If
it's not available now (GC thread tuning) then certainly it should be a
consideration for the next version.

Rein
 
Mickey Williams

Rein Petersen said:
Yep, IDisposable provides a means to explicitly rid of (larger) objects when
you know they are out of scope instead of waiting for the Garbage collector

IDisposable does not enable you to '[get] rid of' objects. Only the GC does
that. The purpose of the Dispose pattern and IDisposable is to enable scarce
resources such as window handles and database connections to be closed.
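
To make that distinction concrete, here is a minimal C# sketch. The NativeBuffer class is hypothetical (it does not come from any post in this thread) and simply wraps a block of unmanaged memory. Calling Dispose() frees the unmanaged block right away, but the managed object itself stays on the heap until the GC decides to collect it.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory (illustration only).
sealed class NativeBuffer : IDisposable
{
    private IntPtr _handle;

    public NativeBuffer(int bytes)
    {
        _handle = Marshal.AllocHGlobal(bytes);
    }

    // True once the unmanaged block has been released.
    public bool IsDisposed => _handle == IntPtr.Zero;

    public void Dispose()
    {
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle); // release the unmanaged resource now
            _handle = IntPtr.Zero;
        }
        GC.SuppressFinalize(this);        // nothing left for the finalizer to do
    }

    ~NativeBuffer()
    {
        // Safety net in case Dispose() was never called.
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
        }
    }
}

static class Program
{
    static void Main()
    {
        var buffer = new NativeBuffer(1024);
        buffer.Dispose();

        // The managed object is still reachable and usable here;
        // only the unmanaged memory it held has been released.
        Console.WriteLine(buffer.IsDisposed);
    }
}
```

This prints True: the buffer object is still alive after Dispose(), and only the resource it held is gone. Reclaiming the managed object remains the GC's job.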
 
Jon Skeet [C# MVP]

Rein Petersen said:
Yep, IDisposable provides a means to explicitly rid of (larger) objects when
you know they are out of scope instead of waiting for the Garbage collector
to kick in when memory gets used up and it's recommended for unmanaged
resources (to kill them once you're done with them).

It's *only* recommended for unmanaged resources. It *doesn't* get rid
of objects at all.

Rein Petersen said:
But, in a scenario where
several important threads are running, the GC thread may get little
processor time to clean up until memory runs out and it becomes absolutely
necessary to increase its priority.

I'm not actually convinced there *is* a GC thread normally - on
multiprocessor systems there's a CLR which collects concurrently, but
most of the time the only extra thread involved in garbage collection
(as far as I'm aware) is the finalizer thread, which rarely has much to
do.

Rein Petersen said:
It would be nice to ramp the Garbage
Collector's thread priority in inverse proportion to the amount of RAM
available. Without the ability to tune the priority of the GC, I'm inclined
to Dispose() everything - managed and unmanaged alike.

That suggests you don't understand what Dispose() does.

Rein Petersen said:
It might be something MS considers for a later version of the CLR - until
then, I'll be fine explicitly calling Dispose() where I'm concerned about
stagnant memory usage.

Again, Dispose() has little to do with actual memory usage. (There are
a few situations where it would help, but nothing worth going into
detail about here.)
 
Willy Denoyette [MVP]

Why? The GC is self-tuning; what makes you think you can do better?
The Finalizer thread runs at a higher-than-normal priority when it runs, but
that doesn't mean it runs more often than other threads that run at normal
priority. The Finalizer thread is fired by the CLR when the CLR decides it's
time to clean up the garbage. This is all based on some heuristics and some
profiling info collected by the CLR while your code runs. You don't have
access to this info from user code, so IMO you
can't do better, unless you write your own CLR ;-).

Willy.
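
The runtime's internal heuristics are not exposed, but user code can at least observe how often each generation has been collected. The sketch below is only an illustration of that: GcStats is a hypothetical helper name, while GC.MaxGeneration, GC.CollectionCount and GC.KeepAlive are standard System.GC members (GC.CollectionCount assumes .NET 2.0 or later). The exact numbers printed depend on the runtime, machine and build settings, so none are shown here.

```csharp
using System;

// Hypothetical helper: snapshots how many collections each generation has had.
static class GcStats
{
    public static int[] CollectionCounts()
    {
        int[] counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }
}

static class Program
{
    static void Main()
    {
        int[] before = GcStats.CollectionCounts();

        // Create a lot of short-lived garbage and let the runtime decide
        // for itself when (and whether) to collect.
        for (int i = 0; i < 1000000; i++)
        {
            byte[] junk = new byte[1024];
            GC.KeepAlive(junk); // make sure the allocation is not optimized away
        }

        int[] after = GcStats.CollectionCounts();
        for (int gen = 0; gen < after.Length; gen++)
        {
            Console.WriteLine("gen {0}: {1} -> {2}", gen, before[gen], after[gen]);
        }
    }
}
```

Watching these counters under different loads shows the effect of the self-tuning without giving any control over it, which matches the point made above.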
 
Rein Petersen

Willy Denoyette said:
so IMO you
can't do better, unless you write your own CLR ;-).

Willy.

Quite likely - but I would like to know what heuristics or rules the GC uses
to tune itself.

Rein
 
Dave

Jon Skeet said:
I'm not actually convinced there *is* a GC thread normally - on
multiprocessor systems there's a CLR which collects concurrently, but
most of the time the only extra thread involved in garbage collection
(as far as I'm aware) is the finalizer thread, which rarely has much to
do.

Hmm? I'm pretty sure I've read there is a GC thread, even on a single CPU
system, and definitely on a multi-CPU system. I would think it would be
difficult to write a GC that used the thread's registers as possible roots
while running on the same thread.

What evidence do you have to support this?
 
Stu Smith

The most useful tools I've found for looking at (managed) memory usage are
CLR profiler, and the SciTech memory profiler. In many cases they can be
quite an eye-opener.

In all the cases where I've initially thought "that's rather a lot of memory
being consumed", it's /always/ been my fault -- perhaps a single reference
keeping a whole tree alive, classes with finalisers promoting objects to the
next generation, over-use of SqlString; whatever.

If you're using managed memory correctly, it won't matter when the GC
fires -- once you've lost all references to an object tree, you can never
reference them again, so in that sense it won't matter when the GC
collects -- once available, always available.
 
Jon Skeet [C# MVP]

Dave said:
Hmm? I'm pretty sure I've read there is a GC thread, even on a single CPU
system, and definitely on a multi-CPU system. I would think it would be
difficult to write a GC that used the thread's registers as possible roots
while running on the same thread.

What evidence do you have to support this?

I don't - but then I don't have any evidence that there's a separate GC
(as opposed to finalization) thread either. What would such a thread
do, exactly? Any thread which needs to allocate any heap memory would
have to stop for the GC thread and vice versa, so why not just do it
"inline" as it were?

(To answer my own question, of course the GC thread could be looking at
the heap while other threads were all waiting. It still seems to be a
bit of a pain in terms of doing stuff normally, when other threads are
creating a lot of objects.)

I can't see any easy way of listing the threads running in the CLR,
unfortunately - and any GC thread doing so may well be hidden even if
there is one.

You may well be right that there's an extra thread, and I'd be
interested to read about it - does anyone have any documentation either
way? At the moment both of us are guessing, really.
 
Dave

Jon Skeet said:
I don't - but then I don't have any evidence that there's a separate GC
(as opposed to finalization) thread either. What would such a thread
do, exactly? Any thread which needs to allocate any heap memory would
have to stop for the GC thread and vice versa, so why not just do it
"inline" as it were?

I agree that this is logical, but there are other reasons for not doing
this. For one, this would require the GC to run on top of whatever
thread was currently active. While this is simple, it has side effects; e.g.
what if an exception is thrown? The SEH mechanism will walk a stack that
perhaps it shouldn't. What if a security check must be made? The CAS check
walks the stack. Perhaps the GC takes care of these issues, but it would
have to set up a special environment before beginning the actual GC cycle. It
might be easier to create a special thread for this purpose that had the
correct state already in place.

Jon Skeet said:
(To answer my own question, of course the GC thread could be looking at
the heap while other threads were all waiting. It still seems to be a
bit of a pain in terms of doing stuff normally, when other threads are
creating a lot of objects.)



Jon Skeet said:
I can't see any easy way of listing the threads running in the CLR,
unfortunately - and any GC thread doing so may well be hidden even if
there is one.

Actually, this one is easy, at least for the current version of the CLR. If
you list all the OS threads that the CLR "knows" about there is a one-to-one
mapping (at least for a simple console app) between hard OS threads and
managed threads.
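
One way to list the OS threads of the current process from managed code is System.Diagnostics.Process. This is only a sketch and an assumption about what "listing" could look like here: ThreadLister is a hypothetical helper, and the output shows every OS thread in the process without saying which, if any, is used for garbage collection or finalization.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: describes every OS thread in the current process.
static class ThreadLister
{
    public static List<string> Describe()
    {
        var lines = new List<string>();
        Process self = Process.GetCurrentProcess();
        foreach (ProcessThread thread in self.Threads)
        {
            lines.Add(string.Format("OS thread {0}: {1}", thread.Id, thread.ThreadState));
        }
        return lines;
    }
}

static class Program
{
    static void Main()
    {
        foreach (string line in ThreadLister.Describe())
        {
            Console.WriteLine(line);
        }
    }
}
```

Even a trivial console app typically reports more threads than the code itself created, but the IDs alone do not reveal what each thread is for.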

Jon Skeet said:
You may well be right that there's an extra thread, and I'd be
interested to read about it - does anyone have any documentation either
way? At the moment both of us are guessing, really.

--

The best docs I've found on the internals are in ROTOR. While the ROTOR
codebase is not nearly as sophisticated as the commercial version it does
offer some insight. The book "Shared Source CLI" has a chapter on the GC
which states "...all threads running managed code are suspended (except, of
course, for the thread performing the GC)." and "...In Rotor, your thread
will trigger a GC only when it asks for a collection explicitly, when it
performs an object allocation, or else when it is running JIT-compiled code
that polls. The last case involves generating calls from within the JIT
compiler that offer to yield the thread if necessary......indicates to the
thread-scheduling machinery that it would be safe to suspend the thread and
perform a collection."

Unfortunately, I found nothing which explicitly stated one way or the other
which thread it was running on, but the quotes above clearly indicate a
preference for suspending all threads except for the GC thread.

This isn't proof, and I haven't gone through the sources enough to know, but
perhaps someone else here has. But if I was to hazard a guess it would be
that there is a separate GC thread.
 
Jon Skeet [C# MVP]

Dave said:
Actually, this one is easy, at least for the current version of the CLR. If
you list all the OS threads that the CLR "knows" about there is a one-to-one
mapping (at least for a simple console app) between hard OS threads and
managed threads.

Yeah - but how can you tell which are managed threads doing other stuff
and which are GC threads?

Dave said:
The best docs I've found on the internals are in ROTOR. While the ROTOR
codebase is not nearly as sophisticated as the commercial version it does
offer some insight. The book "Shared Source CLI" has a chapter on the GC
which states "...all threads running managed code are suspended (except, of
course, for the thread performing the GC)."

Does that include the finalizer thread then?

Dave said:
and "...In Rotor, your thread
will trigger a GC only when it asks for a collection explicitly, when it
performs an object allocation, or else when it is running JIT-compiled code
that polls. The last case involves generating calls from within the JIT
compiler that offer to yield the thread if necessary......indicates to the
thread-scheduling machinery that it would be safe to suspend the thread and
perform a collection."

Right.

Dave said:
Unfortunately, I found nothing which explicitly stated one way or the other
which thread it was running on, but the quotes above clearly indicate a
preference for suspending all threads except for the GC thread.

Yes - but "the GC thread" in this case could be "the thread which
triggered GC" here. (I could well be misreading it, of course...)

Dave said:
This isn't proof, and I haven't gone through the sources enough to know, but
perhaps someone else here has. But if I was to hazard a guess it would be
that there is a separate GC thread.

You've persuaded me to the extent that I'd rather not hazard a guess
either way now :)
 
Mickey Williams

Dave said:
Hmm? I'm pretty sure I've read there is a GC thread, even on a single CPU
system, and definitely on a multi-CPU system. I would think it would be
difficult to write a GC that used the thread's registers as possible roots
while running on the same thread.

It's called thread hijacking. The JIT compiler and other parts of the runtime
conspire together to determine points at which a thread can be safely
stopped and/or co-opted - for example, if a method can be seen to have a
reasonable and predictably short time-boundary, the return address on the
stack can be replaced with the address of a GC work method. See this link
to MSDN mag:
http://msdn.microsoft.com/msdnmag/issues/1200/GCI2/GCI2.asp
 
Mickey Williams

Dave said:
The best docs I've found on the internals are in ROTOR. While the ROTOR
codebase is not nearly as sophisticated as the commercial version it does
offer some insight.

My understanding is that the SSCLI release does not include any of the
commercial GC algorithm. It shares most of the heap structure, but the GC is
simplified.
 
Dave

Jon Skeet said:
Yeah - but how can you tell which are managed threads doing other stuff
and which are GC threads?

NOW you are asking something that I cannot answer :)

Jon Skeet said:
Does that include the finalizer thread then?

When a garbage collection cycle is in progress all threads that could
disturb its tracing the roots have to be suspended, and since the finalizer
thread can do this, it would also have to be suspended.

Jon Skeet said:
Yes - but "the GC thread" in this case could be "the thread which
triggered GC" here. (I could well be misreading it, of course...)

True.


Jon Skeet said:
You've persuaded me to the extent that I'd rather not hazard a guess
either way now :)

--
me too :)

As an aside, I expect there are a lot of very bright folks at MSFT working on
squeezing every last ounce of performance from the GC, and perhaps some are
even thinking about a deterministic version of one.
 
Dave

The best docs I've found on the internals are in ROTOR. While the ROTOR
My understanding is that the SSCLI release does not include any of the
commercial GC algorithm. It shares most of the heap structure, but the GC is
simplified.

--
That's my understanding as well (the book states as much). But the ROTOR
sources are not entirely dissimilar, and even if the GC cycle itself is not
as sophisticated there is no way to tell (unless you can look at MSFT's
source) if this includes the threading model it uses as part of the GC
cycle.
 
Dave

Mickey Williams said:
It's called thread hijacking. The JIT compiler and other parts of the runtime
conspire together to determine points at which a thread can be safely
stopped and/or co-opted - for example, if a method can be seen to have a
reasonable and predictably short time-boundary, the return address on the
stack can be replaced with the address of a GC work method. See this link
to MSDN mag:
http://msdn.microsoft.com/msdnmag/issues/1200/GCI2/GCI2.asp

Thanks for the link. I'd read that article a long time ago and had forgotten
about Richter's description of thread hijacking.

He states: "...When the currently executing method returns, this special
function will execute, suspending the thread....When the collection is
complete, the thread will resume and return to the method that originally
called it."

This implies that the hijacked thread is suspended, implying that the GC
cycle is run on a separate thread. Have you any information that directly
addresses the issue of whether the GC collection cycle is run on a separate
thread or on a "borrowed" thread?

Dave
 
