thread-specific use of Debug object

Jon Sequeira

Is there a way to have a debug listener "listen in" on only one thread?

I'd like to selectively get debug output only from certain threads, and
have each thread's debug output go into a separate log. Is this possible
using the Debug object?

Any insights are appreciated.

Thanks.

--Jon
 
Dave

You could use System.Threading.Thread.CurrentThread.GetHashCode() to
distinguish one thread from another, and then base your log decisions on
that.
 
Jon Skeet

Dave said:
You could use System.Threading.Thread.CurrentThread.GetHashCode() to
distinguish one thread from another, and then base your log decisions on
that.

You don't need to use the hashcode - just use the Thread.CurrentThread
reference itself, in the same way you would use any other object.
 
Jon Sequeira

The problem isn't identifying the executing thread, it's that the
System.Diagnostics.Debug.Write commands send output to all attached
DebugListeners, regardless of thread.
 
Jason Smith

I assume you mean "TraceListener," right? Anyway, it should be simple to
have your TraceListener filter out anything that isn't coming from a
specific thread. Yes, it will get called for every Debug.Write, but you can
ignore anything you don't want.

Maybe. Haven't actually tried this, but should work in theory.

Jon Sequeira said:
The problem isn't identifying the executing thread, it's that the
System.Diagnostics.Debug.Write commands send output to all attached
DebugListeners, regardless of thread.
 
Jon Skeet

Jon Sequeira said:
The problem isn't identifying the executing thread

Sure - it's just I've seen a lot of posts over the last year or so
where people seem to think you need to take the hashcode of a thread in
order to know which one it is. I wanted to correct that idea.
it's that the
System.Diagnostics.Debug.Write commands send output to all attached
DebugListeners, regardless of thread.

Well, they *call* all of the attached DebugListeners. It's up to the
listeners to decide whether or not they want to write any output for
that call.
 
Jon Sequeira

So then where does the code go in the TraceListener to intercept debug
messages and selectively write or prevent them from being written? Would you
suggest creating a class that inherits from TraceListener and overriding the
Write(...) functions? Or is there a simpler way I'm overlooking?

Thanks.

--Jon
 
Jon Skeet

Jon Sequeira said:
So then where does the code go in the TraceListener to intercept debug
messages and selectively write or prevent them from being written? Would you
suggest creating a class that inherits from TraceListener and overriding the
Write(...) functions? Or is there a simpler way I'm overlooking?

Writing your own TraceListener is exactly the way to go, IMO.
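
For readers following along, here is a minimal sketch of what such a listener might look like. The class name SingleThreadListener and its constructor parameters are made up for illustration; only TraceListener, its Write/WriteLine/Flush members, and Thread.CurrentThread come from the framework. It assumes each thread of interest gets its own listener and its own TextWriter.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

// Hypothetical listener: it is called for every Debug/Trace write, but only
// forwards messages that were written on the thread it was created for.
public class SingleThreadListener : TraceListener
{
    private readonly Thread owner;      // thread whose output we keep
    private readonly TextWriter output; // that thread's log destination

    public SingleThreadListener(Thread owner, TextWriter output)
    {
        this.owner = owner;
        this.output = output;
    }

    public override void Write(string message)
    {
        // Listeners are invoked on the thread that made the call, so the
        // current thread is the thread that produced the message.
        if (Thread.CurrentThread == owner)
        {
            lock (output) { output.Write(message); }
        }
    }

    public override void WriteLine(string message)
    {
        if (Thread.CurrentThread == owner)
        {
            lock (output) { output.WriteLine(message); }
        }
    }

    public override void Flush()
    {
        // Flush may be requested from any thread, hence the lock.
        lock (output) { output.Flush(); }
    }
}
```

One instance per thread would then be added to the listeners collection, each with a different writer, so every thread's output lands in its own log while the other listeners ignore it.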
 
Dave

How would you use that to determine which file to log an event to?
Jon Skeet said:
You don't need to use the hashcode - just use the Thread.CurrentThread
reference itself, in the same way you would use any other object.

I wasn't referring to using the thread object itself but to using the
hashcode as an index into a table of file names and actions. An alternative
is to use thread local storage for the same purpose. A central routine could
then make a decision about log/no log based on that data.
 
Jon Skeet

Dave said:
I wasn't referring to using the thread object itself but to using the
hashcode as an index into a table of file names and actions.

But the only benefit of using the hashcode instead of the thread
reference itself as the key into the table would be to allow the thread
object to be garbage collected - and there are better ways round that
which don't give the possibility of hashcode collisions.
An alternative
is to use thread local storage for the same purpose. A central routine could
then make a decision about log/no log based on that data.

That would indeed be a more straightforward way of doing it.
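
As a rough illustration of the thread-local approach, the sketch below keeps a per-thread log/no-log flag in a [ThreadStatic] field. The class and member names (ThreadLogSettings, EnableForCurrentThread, IsEnabledForCurrentThread) are invented for this example; only ThreadStaticAttribute itself is a framework type.

```csharp
using System;
using System.Threading;

// Hypothetical settings holder: a central routine can ask it whether the
// calling thread wants its debug output logged.
public static class ThreadLogSettings
{
    // Each thread sees its own copy of this field. A thread that never sets
    // it reads the default value (false), which acts as the "no log" default.
    [ThreadStatic]
    private static bool enabled;

    public static void EnableForCurrentThread()
    {
        enabled = true;
    }

    public static bool IsEnabledForCurrentThread
    {
        get { return enabled; }
    }
}
```

A thread that wants its output logged calls EnableForCurrentThread() once; threads that never call it, including ones created by other libraries, simply get the default.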
 
Dave

Jon Skeet said:
But the only benefit of using the hashcode instead of the thread
reference itself as the key into the table would be to allow the thread
object to be garbage collected - and there are better ways round that
which don't give the possibility of hashcode collisions.
There is no possibility of a hashcode collision - each thread has a unique
hashcode. The issue of garbage collecting a thread object is unrelated to
this.

That would indeed be a more straightforward way of doing it.
It's a different way. It also creates garbage collection issues about the
data stored in the TLS.
 
Jon Skeet

There is no possibility of a hashcode collision - each thread has a unique
hashcode.

I would suggest it's only likely to be unique while the thread is
alive. In other words, no two threads which are currently alive (in any
way at all) will have the same hashcode, but I'd expect that a new
thread might have the same hashcode as a thread which has long since
expired.

In fact, I can prove that with a test program:

using System;
using System.Threading;

public class Test
{
    public static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            // Start a thread that does nothing and print its hashcode.
            Thread t = new Thread(new ThreadStart(NoOp));
            t.Start();
            Console.WriteLine(t.GetHashCode());

            // Let the thread finish, then force garbage collection
            // before the next thread is created.
            Thread.Sleep(1000);
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
        }
    }

    static void NoOp()
    {
    }
}

The above creates 10 separate threads, correct? And yet the output (on
my .NET 1.1 installation) is:

1
1
1
1
1
1
1
1
1
1

10 different threads (as far as .NET is concerned; they may or may not
have been running in the same thread as far as the OS is concerned)
with the same hashcode.
The issue of garbage collecting a thread object is unrelated to
this.

Not really - see above.
It's a different way. It also creates garbage collection issues about the
data stored in the TLS.

I would expect that the data could be garbage collected when the thread
had finished.
 
Dave

Jon Skeet said:
I would suggest it's only likely to be unique while the thread is
alive. In other words, no two threads which are currently alive (in any
way at all) will have the same hashcode, but I'd expect that a new
thread might have the same hashcode as a thread which has long since
expired.

In fact, I can prove that with a test program:

The results are what I would expect. The IDs are unique at a
given point in time; there is no guarantee made that the IDs will not be
reused/recycled.

In the context of the original message the fact that the thread ID is unique
during its lifetime is all that is required to make a decision. When the
thread terminates it should notify the central decision making logic of
that, and when a new thread is created it should register itself. The reuse
of the hashcode should not affect a properly designed system.
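
A sketch of the register/unregister scheme described above, for illustration only. The ThreadRegistry class and its method names are assumptions, not code from this thread; the key is the current thread's hashcode, as proposed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical central table: maps a thread's hashcode to a log/no-log
// decision. Threads register when they start and unregister before exiting.
public static class ThreadRegistry
{
    private static readonly Dictionary<int, bool> decisions =
        new Dictionary<int, bool>();
    private static readonly object gate = new object();

    public static void Register(bool logEnabled)
    {
        // Overwrites any stale entry left behind under a reused hashcode.
        lock (gate) { decisions[Thread.CurrentThread.GetHashCode()] = logEnabled; }
    }

    public static void Unregister()
    {
        lock (gate) { decisions.Remove(Thread.CurrentThread.GetHashCode()); }
    }

    public static bool ShouldLog()
    {
        lock (gate)
        {
            bool value;
            // Threads that never registered are treated as "no log".
            return decisions.TryGetValue(Thread.CurrentThread.GetHashCode(), out value)
                && value;
        }
    }
}
```

Note that the scheme only behaves as described if every thread that can reach ShouldLog() either registers first or is unregistered on exit by its predecessor; a thread that does neither may pick up an old entry.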




10 different threads (as far as .NET is concerned; they may or may not
have been running in the same thread as far as the OS is concerned)
with the same hashcode.

The results you saw were an artifact of the code itself. I ran a similar
test (with more logging) and saw the hash code toggle between a value of 2
and 3.

In each case the underlying system OS thread was unique. If you want to
see this yourself you can capture the system OS threads that get created
when you create a managed thread and dump their thread IDs.

....add routines like this to your test (no attempt was made to optimize the
perf of the code)...capture the process threads before and after a managed
thread is created, take the difference and dump the different thread IDs.

// Requires: using System; using System.Collections; using System.Diagnostics;

// Fills the list with the OS thread IDs of the current process.
static void GetPThreads(ArrayList list)
{
    list.Clear();
    foreach (ProcessThread pt in Process.GetCurrentProcess().Threads)
    {
        list.Add(pt.Id);
    }
}

// Returns the IDs that are in newList but not in origList.
static ArrayList Difference(ArrayList origList, ArrayList newList)
{
    ArrayList diff = new ArrayList();
    foreach (int id in newList)
    {
        if (!ContainedIn(origList, id))
            diff.Add(id);
    }
    return diff;
}

static bool ContainedIn(ArrayList list, int id)
{
    foreach (int oid in list)
    {
        if (oid == id)
            return true;
    }
    return false;
}

static void DumpIDs(ArrayList list)
{
    foreach (int id in list)
        Console.WriteLine("tid: {0}", id);
}

Not really - see above.

Thread objects are special critters in the runtime; thread references bleed
across appdomain boundaries.

I would expect that the data could be garbage collected when the thread
had finished.

Correct - it won't be collected until the thread has been collected.
 
Jon Skeet

Dave said:
The results are what I would expect. The IDs are unique at a
given point in time; there is no guarantee made that the IDs will not be
reused/recycled.

That to me doesn't make them unique then, and garbage collection most
certainly *is* relevant.
In the context of the original message the fact that the thread ID is unique
during its lifetime is all that is required to make a decision. When the
thread terminates it should notify the central decision making logic of
that, and when a new thread is created it should register itself. The reuse
of the hashcode should not affect a properly designed system.

That seems a lot more complicated (and bug-prone) than just using a
WeakReference or thread-local storage (potentially just using a
[ThreadStatic] attribute on a variable, rather than allocating the
storage manually). It also means you've got to have complete control of
all your threads in terms of the start and end of their lifetime - you
may not have that control.
The results you saw were an artifact of the code itself. I ran a similar
test (with more logging) and saw the hash code toggle between a value of 2
and 3.

Sure - but the important thing is that the hashcode isn't unique over
time.
In each case the underlying system OS thread was unique. If you want to
see this yourself you can capture the system OS threads that get created
when you create a managed thread and dump their thread IDs.

Right - although I *suspect* that isn't guaranteed, and I wouldn't like
to assume it.
...add routines like this to your test (no attempt was made to optimize the
perf of the code)...capture the process threads before and after a managed
thread is created, take the difference and dump the different thread IDs.


Thread objects are special critters in the runtime; thread references bleed
across appdomain boundaries.

Sure, but I don't see how that's relevant here. My point was that
garbage collection affects what a thread's hashcode will be.
Correct - it won't be collected until the thread has been collected.

Exactly. So what are the garbage collection issues of this compared
with the way which involves threads registering and "unregistering"
themselves?
 
Dave

Jon Skeet said:
That to me doesn't make them unique then,

If you want uniqueness after thread termination then I recommend generating
a GUID and associating it with a thread - you can convert it to a string and
set the thread name property to this. If uniqueness is required only during
the thread lifetime then an ID that is unique throughout the system while
the thread exists is all that is required; the hashcode satisfies this
condition. You seem to want something between these two - an id that is
unique for something more than the lifetime of a thread and something less
than globally unique.
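
The GUID suggestion could be sketched roughly as follows. NamedThreads and Create are hypothetical names used only for this example; Guid.NewGuid() and Thread.Name are the framework members being relied on.

```csharp
using System;
using System.Threading;

// Hypothetical helper: creates a thread whose Name is a freshly generated
// GUID string, giving an identity that is not reused after the thread exits.
public static class NamedThreads
{
    public static Thread Create(ThreadStart body)
    {
        Thread t = new Thread(body);
        // Assign the name before the thread starts running.
        t.Name = Guid.NewGuid().ToString();
        return t;
    }
}
```

Code running on such a thread can then read Thread.CurrentThread.Name and use it as a key, for example as part of a log file name.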


and garbage collection most
certainly *is* relevant.

I never said it wasn't, but GC is unrelated to uniqueness. It is related to
cleanup when TLS is used.


In the context of the original message the fact that the thread ID is unique
during its lifetime is all that is required to make a decision. When the
thread terminates it should notify the central decision making logic of
that, and when a new thread is created it should register itself. The reuse
of the hashcode should not affect a properly designed system.


That seems a lot more complicated (and bug-prone) than just using a
WeakReference or thread-local storage (potentially just using a
[ThreadStatic] attribute on a variable, rather than allocating the
storage manually). It also means you've got to have complete control of
all your threads in terms of the start and end of their lifetime - you
may not have that control.
Using a weak reference is more bug prone. Using TLS can be bug prone.
Really, if registering/unregistering is too difficult for someone to get
right then using threads at all may be too dangerous.

My point was that if all that is needed is a simple identity test then the
hashcode will serve that purpose. If you need a more elaborate mechanism
with more guarantees (e.g. unique beyond thread lifetime) then there are
plenty of ways to do it.

Sure - but the important thing is that the hashcode isn't unique over
time.

The operating system itself does not make that guarantee. Neither does the
CLR.

As an example the NT OS maintains a process relative table of handles. The
value of the handle can be recycled as the process opens and closes files -
this does not make that handle any less unique during the execution of the
application. The OS does not get confused as to what the handle refers to.
Right - although I *suspect* that isn't guaranteed, and I wouldn't like
to assume it.
I never assumed it; I showed you that in the current implementation it is
unique. When the runtime is rewritten to support fibers or some other thread
construct then this will probably change. The current implementation
creates an OS thread each time you create a managed thread.

Sure, but I don't see how that's relevant here. My point was that
garbage collection affects what a thread's hashcode will be.
My point is that GC is unrelated to uniqueness. If thread object references
bleed across appdomain boundaries (they do), and if hashcodes are unique
within an appdomain boundary (they are), then they are also unique across an
entire application. The actual value of the hashcode is meaningless - it is
just a number.
Exactly. So what are the garbage collection issues of this compared
with the way which involves threads registering and "unregistering"
themselves?
The only issue is one of timing - since you are relying on the runtime to
cleanup any references to objects stored in TLS then those objects will not
be collected for some time after a thread has terminated, and totally at the
discretion of the runtime. There's nothing wrong with this, it's just
something to be aware of, e.g. there may be large objects that require
special handling, or objects on which you want to invoke a Dispose method.
 
Jon Skeet

Dave said:
If you want uniqueness after thread termination then I recommend generating
a GUID and associating it with a thread - you can convert it to a string and
set the thread name property to this. If uniqueness is required only during
the thread lifetime then an ID that is unique throughout the system while
the thread exists is all that is required; the hashcode satisfies this
condition. You seem to want something between these two - an id that is
unique for something more than the lifetime of a thread and something less
than globally unique.

I would imagine the OP would want something so that two threads which
could occur at different points in time would never end up accidentally
taking each other's settings. Using a hashcode which may be reused
requires extra work removing settings - using either TLS or a weak-
reference to the thread reference itself doesn't require as much extra
work.
I never said it wasn't, but GC is unrelated to uniqueness.

No it's not - if the garbage collector hadn't fired in my test, the
hashcodes would have remained unique, because the thread objects would
still have existed even if the threads themselves had finished running.
It is related to cleanup when TLS is used.
Agreed.
That seems a lot more complicated (and bug-prone) than just using a
WeakReference or thread-local storage (potentially just using a
[ThreadStatic] attribute on a variable, rather than allocating the
storage manually). It also means you've got to have complete control of
all your threads in terms of the start and end of their lifetime - you
may not have that control.
Using a weak reference is more bug prone.

I wouldn't have thought so. It requires the code in exactly one place,
rather than spread throughout the application wherever threads are
started/finished. It may be slightly harder to write that code in the
first place - but it's far from actually tricky.
Using TLS can be bug prone.

In what kind of way?
Really, if registering/unregistering is too difficult for someone to get
right then using threads at all may be too dangerous.

I disagree - it's very easy to just forget something a single time.
It's not that it's difficult to do, but it's easy to forget to do.
Also, as I said, it doesn't work well if you don't control the thread
intimately.
My point was that if all that is needed is a simple identity test then the
hashcode will serve that purpose. If you need a more elaborate mechanism
with more guarantees (e.g. unique beyond thread lifetime) then there are
plenty of ways to do it.

Yes, but I'd argue that registering/unregistering isn't a particularly
good way of doing it :)
The operating system itself does not make that guarantee. Neither does the
CLR.

As an example the NT OS maintains a process relative table of handles. The
value of the handle can be recycled as the process opens and closes files -
this does not make that handle any less unique during the execution of the
application. The OS does not get confused as to what the handle refers to.

I'm sure it doesn't - but might some badly written code not get
confused? For instance, suppose we have one piece of code which opens a
handle and holds onto the value, and another piece of code which takes
that handle, closes it, and opens another file. That could have the
same handle - so the first piece of code could try to close it and
accidentally end up closing a completely different thing from what they
expected. (It might even be a completely different type of resource.)

The WeakReference idea would be relying on the garbage collector, which
is considerably smarter than any code I'm ever likely to write.
I never assumed it

Sorry, I didn't mean to suggest that you were.
I showed you that in the current implementation it is
unique.
Yup.

When the runtime is rewritten to support fibers or some other thread
construct then this will probably change. The current implementation
creates an OS thread each time you create a managed thread.
Right.
My point is that GC is unrelated to uniqueness.

But I still don't agree with that, when it comes to threads. If the GC
never ran, then all threads would have different hashcodes. If the GC
runs, their hashcodes are reused.
If thread object references
bleed across appdomain boundaries (they do), and if hashcodes are unique
within an appdomain boundary (they are), then they are also unique across an
entire application. The actual value of the hashcode is meaningless - it is
just a number.

Yes, but it's a number which may or may not be unique in time,
depending on GC.
The only issue is one of timing - since you are relying on the runtime to
cleanup any references to objects stored in TLS then those objects will not
be collected for some time after a thread has terminated, and totally at the
discretion of the runtime. There's nothing wrong with this, it's just
something to be aware of, e.g. there may be large objects that require
special handling, or objects on which you want to invoke a Dispose method.

If you want to invoke a Dispose method you'll *certainly* need
something manual in order to get a timely clean-up - but I'd have
thought that would rarely be at the point of time when the thread exits
anyway. I would imagine that for the most cases using TLS would be less
bug-prone than any kind of manual cleanup.
 
Dave

I would imagine the OP would want something so that two threads which
could occur at different points in time would never end up accidentally
taking each other's settings.

It depends on what the code has to do. If all that is needed is a simple
pass-fail decision based on thread id then *all* that is needed is for the
thread to register that when the thread is created. When the thread terminates
it will never invoke any method again, so its id will not get used
and no confusion is possible. If another thread starts that reuses the same
id then it will register the new pass-fail...no confusion occurs and settings
do not get hijacked.

I think we are disagreeing on the definition of uniqueness.


Using a hashcode which may be reused
requires extra work removing settings - using either TLS or a weak-
reference to the thread reference itself doesn't require as much extra
work.

See above, and I disagree about it not requiring as much work.

On a side note, TLS makes sense in the context of storing additional data
about a thread. Using a weak reference to a thread object does not.

Weak references are useful when you can recreate the entire state of an
object if the strong reference gets collected; I don't believe that would
ever happen for a thread object while the thread was running, and if it did,
how would you possibly restore its state? How would you associate a new
thread object with an already running thread? The clr could do it but how
would you?

You could use a weak reference to some other object that stored additional
data about the thread, but again, how would you restore the state of a
decision that was made at the time the thread was created? If you have
sufficient data to be able to recreate it, then there is no need for an
additional object - you've already got all you need.


No it's not - if the garbage collector hadn't fired in my test, the
hashcodes would have remained unique, because the thread objects would
still have existed even if the threads themselves had finished running.

That makes no difference. See my earlier post about the definition of
uniqueness.

Also, if the thread itself has finished running then there is no need to
hang onto the thread object because the thread itself will not be making
any method calls that requires access to the thread object's state.

I wouldn't have thought so. It requires the code in exactly one place,
rather than spread throughout the application wherever threads are
started/finished. It may be slightly harder to write that code in the
first place - but it's far from actually tricky.

I don't scatter thread startup/cleanup code around like buckshot - I
centralize that because it's too important that it be correct.

In what kind of way?

In the win32 world you had to allocate and free the data slots. With the
CLR's named data slots the runtime *should* handle the cleanup of objects
stored in it but the slot itself is supposed to be freed.
With unnamed data slots I am not sure what the cleanup requirements are. I
could not find any documentation, on MSDN or google, that discusses the
implications of that. The docs *do* say...

"Threads use a local store memory mechanism to store thread-specific data.
The common language runtime allocates a multi-slot data store array to each
process when it is created. The thread can allocate a data slot in the data
store, store and retrieve a data value in the slot, and free the slot for
reuse after the thread expires. Data slots are unique per thread. No other
thread (not even a child thread) can get that data"...

and
"Slots allocated with this method must be freed with FreeNamedDataSlot."

This clearly implies that there is cleanup code required for thread
termination.

I disagree - it's very easy to just forget something a single time.
It's not that it's difficult to do, but it's easy to forget to do.

That's called a bug. I stand by my earlier statement.

Also, as I said, it doesn't work well if you don't control the thread
intimately.

If you are creating a thread then you certainly *should* control it
intimately.


I'm sure it doesn't - but might some badly written code not get
confused? For instance, suppose we have one piece of code which opens a
handle and holds onto the value, and another piece of code which takes
that handle, closes it, and opens another file. That could have the
same handle - so the first piece of code could try to close it and
accidentally end up closing a completely different thing from what they
expected. (It might even be a completely different type of resource.)
I'm sure badly written code gets confused by a lot of things, but if you
want to defend the right for badly written code to work as its author
*wanted* it to work, rather than the way he wrote it to work, then I
certainly won't stop you :)

The WeakReference idea would be relying on the garbage collector, which
is considerably smarter than any code I'm ever likely to write.
See above. Weak references are great when you can discard the state of an
object and recreate it again if need be. I don't believe that would work
when you are attempting to recreate a state when you do not have
sufficient context to recreate the original calculations.

But I still don't agree with that, when it comes to threads. If the GC
never ran, then all threads would have different hashcodes. If the GC
runs, their hashcodes are reused.
Reuse does not mean they are not unique. Again, we are disagreeing about
the definition of the term.

If you want to invoke a Dispose method you'll *certainly* need
something manual in order to get a timely clean-up - but I'd have
thought that would rarely be at the point of time when the thread exits
anyway.

Depends on what the requirements of the system are.
I would imagine that for the most cases using TLS would be less
bug-prone than any kind of manual cleanup.

Not necessarily. See above about cleaning up the TLS itself.
 
Jon Skeet

Dave said:
It depends on what the code has to do. If all that is needed is a simple
pass-fail decision based on thread id then *all* that is needed is for the
thread to register that when the thread is created. When the thread terminates
it will never invoke any method again, so its id will not get used
and no confusion is possible. If another thread starts that reuses the same
id then it will register the new pass-fail...no confusion occurs and settings
do not get hijacked.

That's fine so long as you control every single thread which is created
in the system. That sounds like an unlikely situation to me. Can you be
absolutely sure that *no* calls you make to any third-party library or
the framework itself will create any new threads? If a third-party
library creates a thread, it may have the same hashcode as an "old"
thread which registered itself, the code won't be able to tell the
difference between this third-party thread and the old thread.

On the other hand, just setting the pass/fail in TLS with a suitable
default being taken when reading the entry for threads which *haven't*
set anything doesn't have that problem.
I think we are disagreeing on the definition of uniqueness.

Yes. I'm talking about uniqueness across the lifetime of the
application, which seems like a much more useful form of uniqueness in
this situation.
See above, and I disagree about it not requiring as much work.

See above for the reason why your scheme doesn't work unless you
control every thread in the system.
On a side note, TLS makes sense in the context of storing additional data
about a thread. Using a weak reference to a thread object does not.

Well, it's a slightly more difficult way of doing it, but it's
essentially doing the same thing.
Weak references are useful when you can recreate the entire state of an
object if the strong reference gets collected;

Yes, but that's not the only situation in which they're useful.
I don't believe that would
ever happen for a thread object while the thread was running, and if it did,
how would you possibly restore its state? How would you associate a new
thread object with an already running thread? The clr could do it but how
would you?

I wouldn't. That's not how I was proposing to use it. The WeakReference
would effectively serve the same purpose as TLS - it would associate
some extra data with a Thread reference, but without preventing the
Thread object from being garbage collected.
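
To illustrate the idea of attaching extra data to a Thread reference without keeping the Thread alive, here is a sketch that swaps in ConditionalWeakTable (a framework class added in later .NET versions than the one discussed in this thread) in place of a hand-written table of WeakReference objects. ThreadLogInfo, ThreadLogTable and their members are hypothetical names for this example.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical per-thread settings object.
public sealed class ThreadLogInfo
{
    public readonly string LogName;
    public ThreadLogInfo(string logName) { LogName = logName; }
}

// Hypothetical table keyed by the Thread reference itself. The table holds
// its keys weakly, so an entry does not stop a finished Thread object from
// being garbage collected, and no hashcode can be confused with another's.
public static class ThreadLogTable
{
    private static readonly ConditionalWeakTable<Thread, ThreadLogInfo> table =
        new ConditionalWeakTable<Thread, ThreadLogInfo>();

    public static void Register(Thread thread, string logName)
    {
        // Add throws if the thread is already registered.
        table.Add(thread, new ThreadLogInfo(logName));
    }

    // Returns null for threads that never registered (the default case).
    public static string LogNameFor(Thread thread)
    {
        ThreadLogInfo info;
        return table.TryGetValue(thread, out info) ? info.LogName : null;
    }
}
```

Only threads that want non-default behaviour need to call Register; there is no matching unregister step because entries disappear along with their Thread objects.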

That makes no difference. See my earlier post about the definition of
uniqueness.

I think we'll have to just agree to disagree on this part.
Also, if the thread itself has finished running then there is no need to
hang onto the thread object because the thread itself will not be making
any method calls that requires access to the thread object's state.

Indeed - which is why I suggested using a WeakReference, precisely
because it means that you can map data from a key reference to a value
without preventing the object the key reference refers to from being
garbage collected.
I don't scatter thread startup/cleanup code around like buckshot - I
centralize that because it's too important that it be correct.

But your scheme requires every bit of code which creates a thread to
register - otherwise it can get an old value. My scheme requires only
those threads which are genuinely interested in different behaviour
registering. In what way is your scheme easier?
In the win32 world you had to allocate and free the data slots. With the
CLR's named data slots the runtime *should* handle the cleanup of objects
stored in it but the slot itself is supposed to be freed.
With unnamed data slots I am not sure what the cleanup requirements are. I
could not find any documentation, on MSDN or google, that discusses the
implications of that. The docs *do* say...

"Threads use a local store memory mechanism to store thread-specific data.
The common language runtime allocates a multi-slot data store array to each
process when it is created. The thread can allocate a data slot in the data
store, store and retrieve a data value in the slot, and free the slot for
reuse after the thread expires. Data slots are unique per thread. No other
thread (not even a child thread) can get that data"...

and
"Slots allocated with this method must be freed with FreeNamedDataSlot."

This clearly implies that there is cleanup code required for thread
termination.

And I would certainly hope that any field marked with the
ThreadStaticAttribute is automatically cleaned up by the CLR -
otherwise it's a pointless attribute.

Admittedly I can't see it *documented* that the thread-local storage
allocated by it is automatically freed, but then it doesn't document
that you have to manually free it, either.

Do you have any real reason not to trust ThreadStaticAttribute?
That's called a bug. I stand by my earlier statement.

It's a different class of bug though. It's a bug of omission rather
than a bug of faulty logic, and
If you are creating a thread then you certainly *should* control it
intimately.

Who says that *you're* creating the thread though? You may not have
created the thread, and may not have control over its use later on, but
still want to register for it to act a particular way in terms of
tracing. Using TLS allows that. Your scheme doesn't.
I'm sure badly written code gets confused by a lot of things, but if you
want to defend the right for badly written code to work as its author
*wanted* it to work, rather than the way he wrote it to work, then I
certainly won't stop you :)

I wasn't trying to. However, isn't it better to not allow code to be
badly written in this way, simply by requiring less of the code to
start with? Using TLS attains that - only one bit of centralised code
needs to be correct to get the "default" behaviour, and threads which
want "non-default" behaviour have a single simple call to make, and are
guaranteed not to interfere with other threads.
See above. Weak references are great when you can discard the state of an
object and recreate it again if need be. I don't believe that would work
when you are attempting to recreate a state when you do not have
sufficient context to recreate the original calculations.

Then you've missed at least one use of weak references.
Depends on what the requirements of the system are.


Not necessarily. See above about cleaning up the TLS itself.

I would rather trust the CLR's cleanup of TLS allocated automatically
using the ThreadStaticAttribute than I would trust my system not to
create any threads I didn't specifically create myself.
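
As a rough illustration of the ThreadStaticAttribute approach, here is a minimal sketch of a listener that only forwards output for threads which have registered a writer. PerThreadListener and Register are hypothetical names invented for this example; only TraceListener, ThreadStaticAttribute and TextWriter are real framework types.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

// Hypothetical listener: forwards output only for threads that registered a writer.
public class PerThreadListener : TraceListener
{
    // One value per thread; threads that never call Register see null.
    [ThreadStatic]
    private static TextWriter current;

    // The single call a thread makes to opt in to its own log.
    public static void Register(TextWriter writer)
    {
        current = writer;
    }

    public override void Write(string message)
    {
        TextWriter writer = current;
        if (writer != null) writer.Write(message);
    }

    public override void WriteLine(string message)
    {
        TextWriter writer = current;
        if (writer != null) writer.WriteLine(message);
    }
}
```

The listener is still called for every Debug.Write on every thread, but a thread which never registered (including one created by a third-party library) has a null field and its output is simply dropped. Nothing needs to be unregistered when a thread ends.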
 
D

Dave

That's fine so long as you control every single thread which is created
in the system. That sounds like an unlikely situation to me. Can you be
absolutely sure that *no* calls you make to any third-party library or
the framework itself will create any new threads?

If my only concern is to capture debug output generated by my own threads
then it is entirely suitable.

If a third-party
library creates a thread, it may have the same hashcode as an "old"
thread which registered itself, the code won't be able to tell the
difference between this third-party thread and the old thread.

That depends entirely on the requirements. If controlled access from all
possible threads is a requirement then I would use a different mechanism. If
the only threads I need to concern myself with are threads that I have
created and have control over, then my method is entirely suitable. Since
the original question never stated how the debug output was to be generated
we were never working from the same page.

I think you and I are disagreeing because we each had different unstated
requirements.

Yes. I'm talking about uniqueness across the lifetime of the
application, which seems like a much more useful form of uniqueness in
this situation.

How long is the lifetime of your app?

If you want uniqueness across the lifetime of an app then using a weak
reference is questionable. If the app lifetime is measured in months and the
rate of generating new references is high then if you have to retain a weak
reference in order to guarantee uniqueness then you will eventually run the
system out of memory. If that is your requirement I would not use a weak
reference in a hashtable; I would generate a guid and store it in TLS.

If the only requirement is uniqueness during the lifetime of the thread then
all that is required is a number that is unique during the thread's
lifetime, which is much shorter than the potential lifetime of the app. The
runtime has no trouble using the hashcode to determine uniqueness; neither
should our code.
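
A minimal sketch of the "generate a guid and store it in TLS" idea mentioned above, assuming a thread-static field is an acceptable form of TLS. ThreadTag and Current are hypothetical names used only for this example.

```csharp
using System;
using System.Threading;

public static class ThreadTag
{
    // A field initializer on a [ThreadStatic] field would only run for one
    // thread, so the value is created lazily on first access instead.
    [ThreadStatic]
    private static Guid id;

    // Returns an identifier that is stable for the calling thread and
    // distinct from the identifier of any other thread.
    public static Guid Current
    {
        get
        {
            if (id == Guid.Empty) id = Guid.NewGuid();
            return id;
        }
    }
}
```

Each thread gets its own value the first time it reads ThreadTag.Current, and no table of identifiers has to be kept anywhere else.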

In win32 there were notifications the app would receive whenever a thread
was created or destroyed - there is no such mechanism in dotnet but there
should be. Then we could use whatever mechanism we wanted for all threads
regardless of where they were created.

See above for the reason why your scheme doesn't work unless you
control every thread in the system.

No, it only requires that you control every thread whose output you wanted
to monitor, and that is exactly what I was assuming. Without a complete
statement of requirements we are each free to make different assumptions,
and we did.

Well, it's a slightly more difficult way of doing it, but it's
essentially doing the same thing.

No, it is very different. With TLS the data itself is locked to the object -
with a weak reference there is no such guarantee - it's a reference to an
object which may cease to exist.

Yes, but that's not the only situation in which they're useful.

Correct but in this situation it is not a method I would use.

I wouldn't. That's not how I was proposing to use it. The WeakReference
would effectively serve the same purpose as TLS - it would associate
some extra data with a Thread reference, but without preventing the
Thread object from being garbage collected.

Except that it increases memory pressure on the system. TLS is a far
superior mechanism to use.

But your scheme requires every bit of code which creates a thread to
register - otherwise it can get an old value. My scheme requires only
those threads which are genuinely interested in different behaviour
registering. In what way is your scheme easier?

Not really. It depends more on how output is captured. I usually use my own
logging/capturing classes so I have total control over this. I usually need
some way of setting priorities, levels of output (verbose, terse) etc, and
this requires a finer degree of control. Since threads come and go I usually
have to make a call from within the thread when it initializes to set the
logging controls. Adjusting the knobs is a necessary evil.

If you wanted to trace a transaction through a heavily multi-threaded
environment you would need a different mechanism anyway - you would need to
associate state with each transaction. And if you wanted to track
transactions across process and machine boundaries it's different again.

And I would certainly hope that any field marked with the
ThreadStaticAttribute is automatically cleaned up by the CLR -
otherwise it's a pointless attribute.

Admittedly I can't see it *documented* that the thread-local storage
allocated by it is automatically freed, but then it doesn't document
that you have to manually free it, either.

Do you have any real reason not to trust ThreadStaticAttribute?

I do trust that attribute but that is not the only means of using TLS. It
can be allocated dynamically - that's the method I was referring to.

It's a different class of bug though. It's a bug of omission rather
than a bug of faulty logic, and

Failing to adhere to a requirement, such as not invoking Dispose on an
object that requires it, is a bug even though it is an omission to invoke a
method call rather than an abuse of a method.

Who says that *you're* creating the thread though? You may not have
created the thread, and may not have control over its use later on, but
still want to register for it to act a particular way in terms of
tracing. Using TLS allows that. Your scheme doesn't.

This goes back to differing assumptions. The original message specifically
mentioned capturing debug output...we then each made different assumptions
about what that meant.

Also, this is a pure out-and-out bug. If you want to pass a handle to some
other code you are supposed to make a duplicate handle to prevent this. It's
similar to COM pointers - you're supposed to AddRef it each time you make a
copy of it.

Then you've missed at least one use of weak references.

True, but it's one use I would never use anyway. I'd use a weak reference as
a weak reference.
 
J

Jon Skeet

Dave said:
If my only concern is to capture debug output generated by my own threads
then it is entirely suitable.

Nope. Here's the situation:

o You create a thread, and set it for debug
o The thread dies naturally
o A third party library creates a new thread, and it gets the same
hashcode as the first thread
o The third party library writes some debug output
o Oops - you incorrectly include it with your own output

That depends entirely on the requirements. If controlled access from all
possible threads is a requirement then I would use a different mechanism. If
the only threads I need to concern myself with are threads that I have
created and have control over, then my method is entirely suitable.

No, because threads you *don't* create may have the same hashcode as
other threads which you *have* created.

Since
the original question never stated how the debug output was to be generated
we were never working from the same page.

True.

I think you and I are disagreeing because we each had different unstated
requirements.

Perhaps, but I can't see that my simple system would fail any
requirements that yours would succeed for.

How long is the lifetime of your app?

If you want uniqueness across the lifetime of an app then using a weak
reference is questionable. If the app lifetime is measured in months and the
rate of generating new references is high then if you have to retain a weak
reference in order to guarantee uniqueness then you will eventually run the
system out of memory. If that is your requirement I would not use a weak
reference in a hashtable; I would generate a guid and store it in TLS.

The reference within the weak reference will be garbage collected,
however - you won't get any false positive matches.

(Yes, you then need to occasionally remove all the mappings which have
a WeakReference where the enclosed reference has been collected, but
that's simple.)
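
To make the WeakReference idea concrete, here is a minimal sketch of a registry that associates data with Thread objects by reference identity and can prune dead mappings. WeakThreadMap and its members are hypothetical names for this example, and the linear search is chosen for brevity rather than speed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical registry: associates data with Thread objects without keeping them alive.
public class WeakThreadMap<T>
{
    private readonly List<KeyValuePair<WeakReference, T>> entries =
        new List<KeyValuePair<WeakReference, T>>();
    private readonly object gate = new object();

    // Adds or replaces the data associated with the given thread.
    public void Set(Thread thread, T data)
    {
        lock (gate)
        {
            entries.RemoveAll(e => ReferenceEquals(e.Key.Target, thread));
            entries.Add(new KeyValuePair<WeakReference, T>(new WeakReference(thread), data));
        }
    }

    // Matches on reference identity, so a new thread never matches an old entry.
    public bool TryGet(Thread thread, out T data)
    {
        lock (gate)
        {
            foreach (KeyValuePair<WeakReference, T> e in entries)
            {
                if (ReferenceEquals(e.Key.Target, thread))
                {
                    data = e.Value;
                    return true;
                }
            }
        }
        data = default(T);
        return false;
    }

    // Drops mappings whose thread has been collected; returns how many were removed.
    public int Prune()
    {
        lock (gate)
        {
            return entries.RemoveAll(e => e.Key.Target == null);
        }
    }
}
```

Because the lookup compares references rather than hashcodes, a later thread which happens to get the same hashcode cannot pick up a stale entry. Calling Prune occasionally is the only housekeeping needed.
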
If the only requirement is uniqueness during the lifetime of the thread then
all that is required is a number that is unique during the thread's
lifetime, which is much shorter than the potential lifetime of the app. The
runtime has no trouble using the hashcode to determine uniqueness; neither
should our code.

It just seems to create an unnecessary weakness in the system.

In win32 there were notifications the app would receive whenever a thread
was created or destroyed - there is no such mechanism in dotnet but there
should be. Then we could use whatever mechanism we wanted for all threads
regardless of where they were created.

Yes, that would make things easier.

No, it only requires that you control every thread whose output you wanted
to monitor, and that is exactly what I was assuming. Without a complete
statement of requirements we are each free to make different assumptions,
and we did.

So are you happy to gather extra output from other threads? 'Cos that's
what will happen in the example I gave above.

No, it is very different. With TLS the data itself is locked to the object -
with a weak reference there is no such guarantee - it's a reference to an
object which may cease to exist.

Yes, at which point (or at some point thereafter) you remove the
mapping from that WeakReference to the data.

Correct but in this situation it is not a method I would use.

Evidently - but I still believe it would be simpler and more reliable
than your method. I think using TLS would be simpler still though.

Except that it increases memory pressure on the system. TLS is a far
superior mechanism to use.

As I've already agreed.

Not really. It depends more on how output is captured. I usually use my own
logging/capturing classes so I have total control over this. I usually need
some way of setting priorities, levels of output (verbose, terse) etc, and
this requires a finer degree of control. Since threads come and go I usually
have to make a call from within the thread when it initializes to set the
logging controls. Adjusting the knobs is a necessary evil.

If you wanted to trace a transaction through a heavily multi-threaded
environment you would need a different mechanism anyway - you would need to
associate state with each transaction. And if you wanted to track
transactions across process and machine boundaries it's different again.

I don't see how your scheme makes any of this easier though...

I do trust that attribute but that is not the only means of using TLS. It
can be allocated dynamically - that's the method I was referring to.

Why though? Why not just use the ThreadStaticAttribute instead? It's
simple and reliable.

Failing to adhere to a requirement, such as not invoking Dispose on an
object that requires it, is a bug even though it is an omission to invoke a
method call rather than an abuse of a method.

I think you're missing my point. Let's drop this bit - we're never
going to agree on which scheme is likely to be buggier.

This goes back to differing assumptions. The original message specifically
mentioned capturing debug output...we then each made different assumptions
about what that meant.

Indeed - but I can't see any situation where your scheme gives *more*
control or reliability than using TLS, but I've given an example
(above) where it does something which you probably *don't* want (namely
capturing the output from a third-party library thread, simply because
it *happened* to get the same hashcode as a previous thread you
created).

[Reuse of handles]
Also, this is a pure out-and-out bug. If you want to pass a handle to some
other code you are supposed to make a duplicate handle to prevent this. It's
similar to COM pointers - you're supposed to AddRef it each time you make a
copy of it.

Yes, it's a bug. However, as I said before, it's easy with TLS to get
to the state where you can't *write* that kind of bug. I prefer
frameworks where I can't make mistakes to ones where I can.

True, but it's one use I would never use anyway. I'd use a weak reference as
a weak reference.

Eh? That use is using it as a weak reference. It's a perfectly valid
use, and a handy one to know about.
 
