Threads & Garbage Collection

Martin Maat

I am puzzled.

I have this object that uses a thread. The thread is encapsulated by the
object, the object has Start and Stop methods to enable the client to start
or stop the thread.

I found that the object will not be garbage collected while the thread is
running. Fair enough, the documented explanation is that the GC compacts
objects on the heap and needs to update references; the references will be
invalid for a couple of moments, which could be a problem if any thread is
still using them while the GC is at work. So I have to stop my object
explicitly before the application shuts down. In fact, if I don't, the
application will not shut down at all; the CLR is probably waiting for
me to stop the thread so it can do its final cleanup. Like I said, fair
enough.

But hey... This means that any application that has some background thread
running (which is not uncommon) will _never_ be subject to garbage
collection while it is running. This can't be right. It is a very common
design pattern to have a modest background thread running that periodically
does some useful things and then goes to sleep for a couple of milliseconds.
Does this prevent the GC from kicking in at all on the application's
resources? It seems so: the finalizer of my object does not kick in while
the thread is running.


What is true here and how do I go about it? It would help if there were a
way for the thread to say "Hey, GC, I will enter a safe state now just to
allow you to do your stuff. When you're done, wake me up again, okay?" But I
haven't found a way to do that (bluntly calling GC.Collect() from the thread
does not make any difference). So I am probably missing something. What is
happening? Is there really no garbage collection for applications that do
not take down all their threads on a regular basis?

Martin.
 
Jochen Kalmbach

Martin said:
So I have to stop my object explicitly before the application shuts
down. In fact, if I don't, the application will not shut down at all,

Set the IsBackground member of the thread to "true" and then your app will
terminate the thread if the app terminates.

<msdn>
A thread is either a background thread or a foreground thread. Background
threads are identical to foreground threads, except that background threads
do not prevent a process from terminating. Once all foreground threads
belonging to a process have terminated, the common language runtime ends
the process by invoking Abort on any background threads that are still
alive.
</msdn>

See:
http://msdn.microsoft.com/library/en-us/cpref/html/frlrfSystemThreadingThreadClassIsBackgroundTopic.asp

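A minimal sketch of the IsBackground suggestion, assuming a worker loop that never returns on its own. The class and method names (BackgroundThreadDemo, StartWorker, Work) are made up for illustration and are not from the original post.

```csharp
using System;
using System.Threading;

public static class BackgroundThreadDemo
{
    // Hypothetical worker loop; stands in for the poster's periodic work.
    private static void Work()
    {
        while (true)
        {
            Thread.Sleep(100);
        }
    }

    public static Thread StartWorker()
    {
        Thread worker = new Thread(Work);
        // A background thread does not keep the process alive.
        worker.IsBackground = true;
        worker.Start();
        return worker;
    }

    public static void Main()
    {
        Thread worker = StartWorker();
        Console.WriteLine("IsBackground: " + worker.IsBackground);
        // Main returns here; the process can exit even though Work() never returns.
    }
}
```

Without the IsBackground line the same program would hang at exit, because the worker would count as a foreground thread.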
But hey... This means that any application that has some background
thread running (which is not uncommon) will _never_ be subject to
garbage collecting while it is running.

You have a foreground thread and no background thread...
But from GC view there is no difference...

Does this prevent the GC to kick in at all on
the application's resources?
No.

It seems so, the finalizer of my object
does not kick in while the thread is running.

Maybe you have a reference to this object...

To perform a "manual" GC you should call

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

By the way: you should avoid using finalizers!!!
If you need to free resources or other things you should use the IDisposable
pattern (which can easily be used with the C# "using" statement).

What is true here and how do I go about it? It would help if there
were a way for the thread to say "Hey, GC, I will enter a safe state
now just to allow you to do your stuff. When you're done, wake me up
again, okay?"

Why do you care about the GC!?
If the GC wants a garbage collection, it will suspend ALL threads in the
process, do a collection and resume all threads again...



By the way: What is your problem ?


--
Greetings
Jochen

Do you need a memory-leak finder ?
http://www.codeproject.com/tools/leakfinder.asp


Do you need daily reports from your server ?
http://sourceforge.net/projects/srvreport/
 
J.Marsch

Martin:

I'm not sure that the behavior that you are seeing affects Garbage
collection as a whole -- rather, just the object that owns the method that
the thread runs.

One fact that might interest you: even in unmanaged applications (like C++ or
Delphi), if you let a thread "run away" without exiting it, it will hold
your process open.

Now, here's what I _think_ is going on in .Net: (why your object stays
alive):
Your object encapsulates the thread, right? So, the delegate that you
passed to the System.Threading.Thread constructor points at an instance
method of your manager object, right?

A delegate contains a member called Target. Target is a reference to the
object that has the method that is being invoked by the delegate. So, when
you construct the thread, and you pass it a ThreadStart delegate, you are
giving out a reference to your object. So, there is no (or very little)
magic going on here. If I were going to place bets, I'd guess that it goes
something like this:

You have an O/S thread that is alive. It somehow keeps alive a reference to
the System.Threading.Thread object that manages it (this part might be
magic, but I thought that I read somewhere that a reference to the Thread
object is put in TLS on the thread that it manages). The Thread object
holds a reference to a delegate (your ThreadStart delegate), which, in turn,
references your management object (through the Target property).

If that's right, there is no special GC behavior going on, there is just a
valid reference to your management object until the thread goes away. So,
your object is held in memory.

Facts vs hypothesis:

Fact: I know for a fact that a delegate will hold its target (the owner of
the method that will be invoked) in memory. This is well tested, documented
behavior.

Fact: I know that you hand a delegate to the Thread object in the
constructor.

Hypothesis: It's pretty likely that the Thread object holds on to your
delegate for its lifetime. Therefore the target object cannot be eligible
for collection until the Thread object is eligible for collection.

Hypothesis: I'm pretty sure that the Thread object is kept alive at least
until the thread that it manages exits (I'm basically positive of this, but I
haven't decompiled the code, so I won't call it a fact).

Conclusion: The System.Threading.Thread object stays alive until the thread
exits. The Thread object holds a reference to your management object (through
a delegate). So, your management object cannot be released from memory until
the thread exits.
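The first "fact" above is easy to check. The sketch below uses a made-up Manager class (not from the original post) and shows that a ThreadStart delegate built from an instance method carries a reference to that instance in its Target property.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the poster's thread-owning object.
public class Manager
{
    public void Run()
    {
        // thread body would go here
    }
}

public static class TargetDemo
{
    public static void Main()
    {
        Manager manager = new Manager();

        // The delegate captures both the method and the instance it belongs to.
        ThreadStart start = new ThreadStart(manager.Run);

        // Target is the very same object, so anything holding the delegate
        // also keeps the manager reachable.
        Console.WriteLine(object.ReferenceEquals(start.Target, manager)); // prints "True"
    }
}
```

Whether the Thread object itself keeps the delegate for its whole lifetime is still the hypothesis stated above; this only demonstrates the delegate-to-target link.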
 
Martin Maat

Jochen Kalmbach said:
Set the IsBackground member of the thread to "true" and then your app will
terminate the thread if the app terminates.

Okay, that worked and it will do. It isn't exactly what I want though: the
thread is just killed by the runtime, and I don't get a chance to end it
gracefully.
http://msdn.microsoft.com/library/e...stemThreadingThreadClassIsBackgroundTopic.asp

You have a foreground thread and no background thread...
But from GC view there is no difference...


Maybe you have a reference to this object...

I do in the main form class and I expected the finalizer to kick in when the
form was "collected". Somehow I assumed the GC would be so kind as to
collect all of my app's junk when I closed the form. I know I cannot
rely on the GC to kick in at any particular time if I don't call Collect
explicitly, yet I expected it to do so because it DOES if my thread isn't
running. That was puzzling me; I concluded that if a thread keeps the GC from
doing what it normally does, that thread would effectively keep the GC from
performing its primary task.

By the way: you should avoid using finalizers!!!
If you need to free resources or other things you should use the IDisposable
pattern (which can easily be used with the C# "using" statement).

Yeah, I read all about it today. Finalizers are not bad by the way, the main
difference is that finalizers are called by the GC just before the object is
collected and I have to call Dispose() myself at a time I find appropriate.
Neither of the mechanisms solved my problem though, I was looking for an
implicit way to end my thread in case the object was collected while the
thread was running. This doesn't seem possible, not in a controlled way
anyway:

1. if my thread is a foreground thread, no garbage collection occurs on my
object, not even when I put it in a using statement and wait till the cows
come home. The finalizer is _not_ called.

2. if my thread is a background thread, the runtime just bluntly aborts it
and then collects my object. Well, thank you very much! It is like kicking
in my door for me (ruining the lock) and then handing me the keys to open
it.

It's not nice. All I want is that chance to signal my thread so it can end
gracefully. Now I am stopping my thread in the Closed event method which is
okay, my object is a singleton and I don't really need it to be fool-proof.
But it means the thread isn't really encapsulated in the object, it isn't
the OO-type of behavior that you are used to implementing. If the object
dies, I want that shot at cleaning things up which I do not get if a thread
is involved.

The GC should first call Finalize(), then look whether there are still any
threads left and, if there are, take them down (I had my chance, didn't use
it, I had it coming). Perhaps the GC doesn't know which thread belongs to
which object, so this approach isn't possible, yet it would be a lot nicer if
you ask me.

If the GC wants a garbage collection it will suspend ALL threads in the
process, does a collection and resumes all threads again...

Well, I hope so. It's not what I found today though. Yet it must usually be
the case since GC simply would not work at all if it weren't.

Thanks for the information (especially on the IsBackground property)

Martin.
 
J.Marsch

If you want to kill the thread gracefully, you are going to need to provide
some mechanism for notifying the code inside of your thread that it should
exit. See my other post on your thread to see why your object isn't being
cleaned up -- there's a reference to it as long as the thread is alive, so
you have to cause the thread to exit.

BTW: Finalizers _are_ bad. They cause your object to live through at least
one GC more than it would have otherwise. That guarantees that the object
will be promoted to at least one generation older than it really should be.
If it causes your object to be promoted to Generation 2, it could stay in
memory for a very long time before it is collected. Also, if your object
references other objects, they will be held in memory as well (because there
are live references to them), so with one finalizer, you can artificially
promote a whole group of objects to a different generation. That will
cause your application to hold on to memory longer than necessary, and it
will force the garbage collector to do more Gen 1 and (dreaded) Gen 2
collections in order to free memory.

If you want to end your thread in an object oriented way, put a Dispose or a
stop method on your thread manager object, and call it when you are finished
with it. In your dispose, you can signal the thread that it is time to
close.

Encapsulating the thread in your own thread manager object is a good
thing -- if you keep the relationship 1:1, you have a place to manipulate
the thread's state:

    using System.Threading;

    public class MyThreadMgr
    {
        // Signaled by Stop() to tell the worker thread to exit.
        private ManualResetEvent QuitEvent = new ManualResetEvent(false);

        public void Stop()
        {
            // signal the reset event.
            this.QuitEvent.Set();
        }

        public void Start()
        {
            Thread thread = new Thread(new ThreadStart(this.Thread_ThreadStart));
            thread.Start();
        }

        public void Thread_ThreadStart()
        {
            // this runs in the secondary thread
            // i'm assuming it runs in some kind of loop
            // sleeps, and then runs again
            while (true)
            {
                // wait on the quit event
                // accomplishes the sleep with an abort
                // from sleep on quit. sleep 10 sec
                if (this.QuitEvent.WaitOne(10000, false))
                {
                    // perform cleanup code
                    return;
                }
                else
                {
                    // do your thread work
                }
            }
        }
    }
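The "put a Dispose on your thread manager" suggestion could look roughly like the sketch below. It is only one possible shape, under assumptions: the class name DisposableThreadMgr, the IsRunning property, and the 50 ms wait are invented for the example, and Dispose here also joins the thread, which the code above does not do.

```csharp
using System;
using System.Threading;

public sealed class DisposableThreadMgr : IDisposable
{
    private readonly ManualResetEvent quitEvent = new ManualResetEvent(false);
    private Thread thread;
    private bool disposed;

    public bool IsRunning
    {
        get { return thread != null && thread.IsAlive; }
    }

    public void Start()
    {
        thread = new Thread(Run);
        thread.Start();
    }

    public void Dispose()
    {
        if (disposed) return;      // make repeated Dispose calls harmless
        disposed = true;

        quitEvent.Set();           // ask the worker to leave its loop
        if (thread != null)
        {
            thread.Join();         // wait until it has actually finished
            thread = null;
        }
        quitEvent.Close();         // safe now: nobody waits on it any more
    }

    private void Run()
    {
        // Wait doubles as the sleep; returns true as soon as quit is signaled.
        while (!quitEvent.WaitOne(50))
        {
            // periodic work would go here
        }
        // cleanup code would go here
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        DisposableThreadMgr mgr = new DisposableThreadMgr();
        using (mgr)
        {
            mgr.Start();
            Thread.Sleep(120);
        }
        Console.WriteLine("Running after Dispose: " + mgr.IsRunning); // prints "Running after Dispose: False"
    }
}
```

The using block guarantees the stop signal is sent even if an exception is thrown inside it, but the client still has to dispose the object; nothing here runs automatically when the object becomes unreachable.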
 
Martin Maat

J.Marsch said:
If that's right, there is no special GC behavior going on, there is just a
valid reference to your management object until the thread goes away. So,
your object is held in memory.

I suppose so. I cannot null the Target property to see if it makes any
difference; it is read-only.

If you want to end your thread in an object oriented way, put a Dispose or a
stop method on your thread manager object, and call it when you are finished
with it. In your dispose, you can signal the thread that it is time to
close.

Yes, that is what I am doing. I just felt it would make the object more
robust if it were to end the thread nicely before it was collected in case
some client forgot to stop it. I am not too worried about 2nd generation
garbage: the object we are talking about is the main singleton object for
the application, it is initialized soon after application startup and stays
active until the end. This is my stop method:

    public void Stop()
    {
        if ((paceMakerThread != null) && (paceMakerThread.IsAlive))
        {
            keepStepping = false;
            paceMakerThread.Join();
            paceMakerThread = null;
        }
    }

keepStepping is the signal for the thread to stay in the while loop. I just
wanted to call Stop() from the finalizer to make sure everything comes to a
stop the way I want it to. I will just have to be a good client to my own
object now.

Thanks for your insights.

Martin.
 
J.Marsch

Martin:

I see what you are after. I think that's about as good as it gets --
you'll have to signal the thread to stop instead of being signalled that the
application has stopped. If you don't want to do the stop in the main form,
the only other thing that I can think of would be to signal the thread to
stop in the Application.ApplicationExit event -- same deal, though.

Oh, one other thing that I might caution you on: you might have a very subtle
threading bug.
Is your keepStepping variable just a bool field, or are you doing something
thread safe (locking) in the property set?

Reason:
If it's just a plain old variable, and you are not doing any locking: even
though you are only writing to it from your main thread and reading in the
secondary thread, you might get unpredictable behavior if you run your code
on a multi-processor machine.

The reason is that memory is not flat, even though it looks flat to us --
that variable might be in the L1 or L2 cache on one or more processors.

So, suppose that your main thread happens to be running on CPU 0 when it
sets keepStepping = false;. That value might be stored in CPU 0's L2
cache. (In main memory, at this moment in time, the value is still true.)

Now suppose that your secondary thread happens to be executing on CPU 1
(either thread can execute on either CPU, and they can switch freely). When
it checks the value of keepStepping, that value will come from either
CPU 1's cache or from main memory. The value of the variable is only false
in CPU 0's cache, so your secondary thread "sees" true, and it keeps
on running.

Now, that would seem to make threading a complete mess. The trick is that
when you enter a lock {} that clues some very low-level code in that you are
accessing a resource that is shared by more than one thread, and some code
executes to ensure that the CPU caches are consistent. So even though it
doesn't look like your simple assignment needs to be synchronized, it should
be.

Alternatively, you can mark your keepStepping variable as volatile, which (I
believe) prevents it from being cached outside of main memory:
private volatile bool keepStepping;

For more information than you probably want, go here:
http://discuss.develop.com/archives/wa.exe?A2=ind0203B&L=DOTNET&P=R375
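Putting the volatile flag together with the Stop method from earlier in the thread might look like the sketch below. The field names keepStepping and paceMakerThread come from the thread; the PaceMaker class name, the Start and Step methods, the IsRunning property and the 10 ms sleep are assumptions made for the example. As far as the C# language rules go, volatile gives reads acquire semantics and writes release semantics and keeps the JIT from hoisting the read out of the loop, which is what this pattern relies on.

```csharp
using System;
using System.Threading;

public class PaceMaker
{
    // volatile: the loop in Step() re-reads the field on every iteration
    // and sees the write made by Stop() on another thread.
    private volatile bool keepStepping;
    private Thread paceMakerThread;

    public bool IsRunning
    {
        get { return paceMakerThread != null && paceMakerThread.IsAlive; }
    }

    public void Start()
    {
        keepStepping = true;
        paceMakerThread = new Thread(Step);
        paceMakerThread.Start();
    }

    public void Stop()
    {
        keepStepping = false;            // signal the loop to end
        if (paceMakerThread != null)
        {
            paceMakerThread.Join();      // wait for a graceful exit
            paceMakerThread = null;
        }
    }

    private void Step()
    {
        while (keepStepping)
        {
            // periodic work would go here
            Thread.Sleep(10);
        }
    }
}

public static class PaceMakerDemo
{
    public static void Main()
    {
        PaceMaker paceMaker = new PaceMaker();
        paceMaker.Start();
        Thread.Sleep(50);
        paceMaker.Stop();
        Console.WriteLine("Running after Stop: " + paceMaker.IsRunning); // prints "Running after Stop: False"
    }
}
```

A lock around both the read and the write would work as well; volatile is just the lighter option for a single bool flag like this one.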
 
Martin Maat

[threading & simple flag variables in multi-processor environments]
Now, that would seem to make threading a complete mess.

Yes! :)
The trick is that when you enter a lock {} that clues some very
low-level code in that you are accessing a resource that is shared
by more than one thread, and some code executes to ensure that
the CPU caches are consistent. So even though it doesn't look
like your simple assignment needs to be synchronized, it should be.
Alternately, you can mark your keepStepping variable as a volatile,
which (I believe) prevents it from being cached outside of main
memory: private volatile bool keepStepping;

That seems like an elegant solution. You may have saved me a headache (to
occur in a couple of years when multi-processor machines become more widely
used). As I understand from the help text on volatile, all caches are
effectively synchronized when any volatile variable is written. This would
fix any issues with keepStepping (which is a simple bool).

Wow! More homework. Memory barriers... I think I'll stick with locks and
the volatile keyword for now.

Thanks for raising the issue.

Martin.
 
J.Marsch

Glad to help. Good luck with your project!

-- Jeremy


 
