Threading advice: How to wait for all threads to complete without polling.

keeling

Greetings all,

It is my understanding that polling is very bad (I could be wrong
about that). But I have a problem that I don't know how to solve
without polling.

I need to execute a series of mathematically intensive methods and
wait for them ALL to finish before moving on to the next step. I have
a machine with lots of cores and lots of memory so threading seems
like a great solution. To be exact, BackgroundWorker seems to be a
good fit because I would like to limit how many of these problems
are solved concurrently (to limit the hit on resources). Here is my
solution:


MyMethod()
{
    List<BackgroundWorker> myWorkers = new List<BackgroundWorker>();

    // Add problems 1 through n to be solved to the list as new
    // BackgroundWorkers
    // Limit the number of concurrent threads firing on the ThreadPool

    foreach (BackgroundWorker bw in myWorkers)
    {
        bw.RunWorkerAsync();
    }

    while (myWorkers.Count > 0)
    {
        // Sleep a little while (say 100 ms)
        // Check whether each worker is still busy; if not, mark it for removal
        // Remove finished workers from myWorkers
    }

    // Go on to the next task that requires all of the above workers
    // to be finished first
}

But as I said, it is my understanding that this sort of polling is
very bad. Is this assumption correct? If so, what is the better way?

Randy
 
Mick Wilson

But as I said, it is my understanding that this sort of polling is
very bad. Is this assumption correct? If so, what is the better way?

The drawback you're talking about is that execution keeps switching to
the main thread, which has no real work to do, akin to busy waiting:

http://en.wikipedia.org/wiki/Busy_waiting

I'm not sure what advantages you're getting from the BackgroundWorker
(never used it), but Thread.Join() seems like a good solution for you.
Create thread objects for your worker methods (hanging on to the
references) and then join them all when you want to wait for
completion.

http://msdn.microsoft.com/en-us/library/95hbf2ta.aspx
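For illustration, a minimal sketch of the Thread.Join() approach Mick describes (the Solve method and the thread count here are placeholders, not from the original posts):

```csharp
using System.Collections.Generic;
using System.Threading;

class JoinExample
{
    static void Main()
    {
        List<Thread> threads = new List<Thread>();

        for (int i = 0; i < 4; i++)
        {
            int problem = i;  // capture the loop variable for the closure
            Thread t = new Thread(delegate() { Solve(problem); });
            threads.Add(t);
            t.Start();
        }

        // Block until every worker has finished -- no polling loop needed.
        foreach (Thread t in threads)
        {
            t.Join();
        }

        // All workers are done; safe to move on to the next step.
    }

    // Placeholder for the mathematically intensive work.
    static void Solve(int problem) { Thread.Sleep(10); }
}
```

Note that this creates one dedicated thread per problem, so by itself it does not limit concurrency the way the thread pool does.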
 
keeling

The drawback you're talking about is that execution keeps switching to
the main thread, which has no real work to do, akin to busy waiting:

http://en.wikipedia.org/wiki/Busy_waiting

I'm not sure what advantages you're getting from the BackgroundWorker
(never used it), but Thread.Join() seems like a good solution for you.
Create thread objects for your worker methods (hanging on to the
references) and then join them all when you want to wait for
completion.

http://msdn.microsoft.com/en-us/library/95hbf2ta.aspx

The reason I went with the BackgroundWorker is that it uses the
ThreadPool. Because each worker could use a sizable amount of memory,
I want to be able to limit how many are firing at any given time. With
the ThreadPool I can limit this number while the waiting workers
remain queued. In other words I can create all the workers and
forget about them because the ThreadPool will handle the queue.
 
Mick Wilson

The reason I went with the BackgroundWorker is that it uses the
ThreadPool. Because each worker could use a sizable amount of memory,
I want to be able to limit how many are firing at any given time. With
the ThreadPool I can limit this number while the waiting workers
remain queued. In other words I can create all the workers and
forget about them because the ThreadPool will handle the queue.

This example from MSDN shows how to signal the Main thread from a
thread pool using the WaitHandle class. Each worker signals when
it's done (via manualEvent.Set()) and Main proceeds only when all
threads have signaled (via WaitHandle.WaitAll(manualEvents)).

http://msdn.microsoft.com/en-us/library/z6w25xa6.aspx
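A condensed sketch along the lines of that MSDN example (the work itself is elided; the class and variable names here are illustrative):

```csharp
using System.Threading;

class WaitAllExample
{
    static void Main()
    {
        const int taskCount = 4;
        ManualResetEvent[] doneEvents = new ManualResetEvent[taskCount];

        for (int i = 0; i < taskCount; i++)
        {
            doneEvents[i] = new ManualResetEvent(false);
            ManualResetEvent done = doneEvents[i];  // capture for the closure

            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                // ... do the mathematically intensive work here ...

                done.Set();  // signal that this worker has finished
            });
        }

        // Blocks until every event in the array has been signaled.
        WaitHandle.WaitAll(doneEvents);

        // All workers are done; continue with the next step.
    }
}
```

One caveat: WaitHandle.WaitAll() accepts at most 64 handles in a single call, so this shape only works for a bounded number of concurrent tasks.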
 
Pete

Polling's not great, that's for sure.  The problem is mitigated if you use  
Thread.Sleep(), and especially if you use it with a relatively large value  
(say, 1 second or more).  But IMHO it's always better to design something  
event-driven when possible.  And here, it should be possible.


BackgroundWorker is fine and useful as far as it goes.  However, IMHO it's  
a mistake to rely on it to implement your throttling strategy.  The thread  
pool has far too high a max thread count for it to be useful to avoid  
performance and memory consumption issues.  You can change the maximum, 
but other code in .NET uses the thread pool and so you could inadvertently  
hinder some other component in your attempt to limit the number of running  
things.

But, using BackgroundWorker for background tasks is otherwise a fine  
idea.  The real question is how to avoid having a thread sit and wait for  
all the others (and especially your main thread, though you're not explicit about 
what thread is executing "MyMethod()" in your example).  The Join() and 
even the WaitHandle example mentioned by Mick aren't completely awful;  
they're certainly better than polling, and as long as they're done in a  
dedicated thread rather than your main GUI thread, that's fine.

But you can do it without having a separate thread at all sitting and just  
watching.  And IMHO, that's the best.  No wasted thread!  :)

There are actually a variety of techniques you can use, each one depending  
on what you feel comfortable with as well as how the rest of your program 
is architected (including the question of whether it's a GUI or console  
application).  You've got two different design goals: to limit the number  
of active threads, and to deal with the condition when all of the tasks  
have completed, but without having to poll the state.

If you've got a console application, the solution might be as simple as  
something _similar_ to the MSDN example Mick mentioned.  I say "similar"  
because it shows the main thread waiting for a different wait object for  
each worker thread.  This is a bit pointless, given that each worker is 
doing basically the same thing (and so can share some common logic).  It  
also is somewhat limiting, because you can only have so many WaitHandle  
objects that are waited for in one call to WaitAll() (I forget the exact  
limit, but I think it's 64 or so).

Instead of that example, I would modify it so that there's a single  
WaitHandle object and a counter shared among the worker threads.  The  
counter would be incremented for each worker thread, and each thread would  
decrement the counter when it's done.  After decrementing the counter, the  
thread would check to see if it's reached zero and if so, it would signal 
the one WaitHandle object.  (Of course, the counter accesses would be  
synchronized...using a lock() statement or the Interlocked class, for  
example).
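A minimal sketch of that counter-plus-single-event idea, using the Interlocked class for the synchronized decrement (names here are illustrative, not from the original posts):

```csharp
using System.Threading;

class CountdownExample
{
    static int pending;  // number of tasks still running
    static ManualResetEvent allDone = new ManualResetEvent(false);

    static void Main()
    {
        const int taskCount = 100;  // no 64-handle limit with this scheme
        pending = taskCount;

        for (int i = 0; i < taskCount; i++)
        {
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                // ... do the mathematically intensive work here ...

                // The last worker to finish signals the single event.
                if (Interlocked.Decrement(ref pending) == 0)
                    allDone.Set();
            });
        }

        allDone.WaitOne();  // one handle, any number of tasks
    }
}
```

(For what it's worth, .NET 4 later packaged exactly this pattern as the CountdownEvent class.)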

If you have a GUI program, then you can't have your main GUI thread  
sitting there waiting on an event handle.  You can still do it without a  
dedicated "waiting" thread though.  Instead of setting a WaitHandle  
object, just use Control.Invoke() or Control.BeginInvoke() to run a method  
on the GUI thread once all of the tasks have completed.  Other than that  
particular switch, it'd be the same "count the number of running tasks,  
when gets back down to zero, signal" technique as above.

That still doesn't limit the number of threads though.  A very simple  
technique for doing that might be to just create a fixed number of  
BackgroundWorker instances (ideally, one per CPU core) and then have a  
separate queue of work items that each BackgroundWorker can run off of.  
About six months ago, I answered a different question with some sample  
code (actually, the other person's original code with some modifications) 
that does this sort of thing.  You can see the example here:
http://groups.google.com/group/microsoft.public.dotnet.languages.csha...

That example starts up a bunch of BackgroundWorkers which then all try to 
exhaust a queue.  When they run out of stuff to do, they exit.  If you  
wanted to be able to queue up new stuff later, you could either keep track  
of how many active BackgroundWorkers you had and add new ones if you're  
not at your intended maximum yet, or you could just create some fixed  
number of non-exiting threads that just sit and wait for something to be  
added to the queue.  Note that both of these are essentially a way of  
implementing your own thread pool.  You'd get the basic features of the 
thread pool without having to interfere with other code that may depend on  
having more threads available in order to work correctly.
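The fixed-number-of-workers-draining-a-queue idea can be sketched like this; for brevity this uses plain Thread objects rather than BackgroundWorker, and the queue contents are placeholders:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class MiniPoolExample
{
    static Queue<int> workQueue = new Queue<int>();
    static object queueLock = new object();

    static void Main()
    {
        // Queue up the problems to solve (placeholder work items).
        for (int i = 0; i < 100; i++) workQueue.Enqueue(i);

        int workerCount = Environment.ProcessorCount;  // one per core
        List<Thread> workers = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            Thread t = new Thread(DrainQueue);
            workers.Add(t);
            t.Start();
        }

        foreach (Thread t in workers) t.Join();
        // Queue exhausted and every worker has exited.
    }

    static void DrainQueue()
    {
        while (true)
        {
            int item;
            lock (queueLock)
            {
                if (workQueue.Count == 0) return;  // nothing left: exit
                item = workQueue.Dequeue();
            }
            // ... solve problem "item" here ...
        }
    }
}
```

This is exactly the "your own thread pool" shape described above: concurrency is capped by the worker count, not by the ThreadPool's maximum.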

An alternative way to do the same sort of thing would be to instead of  
explicitly managing a specific count of threads, do it implicitly by just 
keeping track of how many active "tasks" are running, queuing tasks  
instead of starting them when you reach your intended limit, and having  
each task check to see if a new task needs to be started when it's  
finished.  Again, this is very similar to a thread pool solution, with the  
main difference being that in this case you never explicitly reuse a  
threading object, but instead do all the management through a queue and a 
count of active tasks.
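That implicit-throttle variant might look something like the following sketch (the limit, the queue contents, and the method names are all illustrative; a real program would combine this with the completion-counting technique described earlier rather than sleeping):

```csharp
using System.Collections.Generic;
using System.Threading;

class ThrottleExample
{
    const int MaxActive = 4;                       // intended concurrency limit
    static int active;                             // tasks currently running
    static Queue<int> pending = new Queue<int>();  // tasks waiting to start
    static object sync = new object();

    static void Main()
    {
        for (int i = 0; i < 20; i++) QueueTask(i);

        // Placeholder wait; a real program would signal completion instead.
        Thread.Sleep(2000);
    }

    static void QueueTask(int task)
    {
        lock (sync)
        {
            if (active < MaxActive)
            {
                active++;
                ThreadPool.QueueUserWorkItem(Run, task);
            }
            else
            {
                pending.Enqueue(task);  // at the limit: queue, don't start
            }
        }
    }

    static void Run(object state)
    {
        // ... do the work for (int)state here ...

        lock (sync)
        {
            // A finishing task starts the next queued one, if any.
            if (pending.Count > 0)
                ThreadPool.QueueUserWorkItem(Run, pending.Dequeue());
            else
                active--;
        }
    }
}
```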

As it happens, I also wrote a sample last year that does something like  
that.  That particular sample is somewhat more complicated than the one 
above, but to some extent that's because I built a little "simulator"  
class that pretends to do stuff, just so I'd have a way to demonstrating  
the interesting part of the code.  Of course, the "simulator" class takes  
up the bulk of the sample, even though it's really not part of the  
illustration per se.  :)

Anyway, you can find that sample here:
http://groups.google.com/group/microsoft.public.dotnet.languages.csha...

The key thing to look at there is the "DownloadManager" class, which is  
where the logic lives for tracking the count of active tasks (in the sample's  
case, a "task" is a file download using asynchronous i/o, but this could  
easily be _any_ sort of task that you can just start and let run  
asynchronously), and for maintaining the queue of tasks that the  
"DownloadManager" class uses to keep the work flowing (items are queued  
when the intended number of "active" tasks are already running).

I think that covers it.  :)

Pete

Pete, Mick,

Thanks for the advice guys. You've given me a couple of areas to
research.

For the record, the application is a console app and each of these
mathematically intensive operations may spawn other operations that
could use threading logic.

Again, thanks for the pointers.

Randy
 
