reuse threadpool item/object (not the thread, but the worker object)

Kevin

Using this:

http://msdn2.microsoft.com/en-us/library/3dasc8as(VS.80).aspx

as an example I have a question concerning the reuse of objects. In
the example 10 instances of the Fibonacci class are made and then all
put in the threadpool at once, which is well within the limits of the
threadpool.

However, what happens if you want to do 10,000 Fibonacci calculations
in the threadpool? You really don't want to make 10,000 instances of
the Fibonacci class as there would be considerable memory overhead
involved.

What you want to do is create a 'bucket' of say 30 (slightly bigger
than the maxthreads of the threadpool) Fibonacci objects, and a queue
(of size 10,000) of calculations (in this case 'n'). The program
would then take a calculation off the queue, populate a Fibonacci
object with the 'n' and then stick it into the threadpool.

When the object is finished, it raises an event or whatever to say it
has finished and goes back in the bucket. The main thread then sees
that there is an object in the bucket and takes another calculation
off the queue and sends the object back into the threadpool.
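
To make the scheme above concrete, here is a rough sketch of it. It is written in Python rather than .NET, with concurrent.futures.ThreadPoolExecutor standing in for the .NET thread pool. The Fibonacci class, run_with_bucket, and the default sizes are hypothetical names and values chosen only for illustration.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class Fibonacci:
    """Hypothetical reusable worker object (not the MSDN sample's class)."""

    def __init__(self):
        self.n = None
        self.result = None

    def calculate(self):
        a, b = 0, 1
        for _ in range(self.n):
            a, b = b, a + b
        self.result = a
        return self.result


def run_with_bucket(inputs, bucket_size=30, max_workers=4):
    inputs = list(inputs)
    results = [None] * len(inputs)

    # The "bucket": a fixed set of worker objects that get reused.
    bucket = queue.Queue()
    for _ in range(bucket_size):
        bucket.put(Fibonacci())

    def work(worker, index, n):
        try:
            worker.n = n
            results[index] = worker.calculate()
        finally:
            # Hand the object back so the main thread can reuse it.
            bucket.put(worker)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, n in enumerate(inputs):
            worker = bucket.get()  # blocks until an object is free
            pool.submit(work, worker, index, n)
    # Leaving the "with" block waits for all submitted work to finish.
    return results
```

Here the blocking bucket.get() plays the role of the "finished" event: the main thread simply sleeps until some object has been returned.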

I am fairly sure I know how I would achieve this. However, I wanted to
know if there are any best practices or existing examples for this kind
of thing that people could point me to. Google searches haven't returned
anything really useful.

Many thanks
Kevin
 
Guest

If you created a new Fibonacci class instance just prior to queueing the
work item onto the ThreadPool, wouldn't it be disposed once the calculation is
complete and that thread has returned, now being available for another
work item?

So if the pool has say, 100 threads, there would never be more than 100
Fibonacci objects in existence, right?
Peter
 
Peter Duniho

Peter Bromberg said:
If you created a new Fibonacci class instance just prior to queueing the
work item onto the ThreadPool, wouldn't it be disposed once the calculation
is complete and that thread has returned, now being available for another
work item?

So if the pool has say, 100 threads, there would never be more than 100
Fibonacci objects in existence, right?

I guess that depends on how the use of the thread pool was designed. If it
starts with the creation of a Fibonacci class instance, which then uses a
thread pool implementation that allows for worker methods to be queued, then
you get as many instances of the Fibonacci class as you ask for, as soon as
you ask for them. There will only be as many active threads running as
there are in the thread pool, but in the meantime you've still got all the
already-created Fibonacci class instances sitting around.

This is, in fact, how the sample code referred to operates. The Fibonacci
class instances are created outside the worker threads, and the thread pool
supports queuing of worker methods (as opposed to blocking when trying to
add a method when all of the threads in the pool are already busy). So no
matter how many Fibonacci instances are chosen, as long as you don't run out
of memory to store them all, they are all instantiated at once.

It would, of course, be possible to reverse the design so that a worker
method was added to the queue that itself instantiates the Fibonacci class
required to do the work. In that case, you would still have the overhead of
queuing up 10,000 worker methods (for example), but each Fibonacci class
would only be instantiated once a thread's been assigned to do some work.

But that's not how this particular sample code works.
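
For illustration only, the reversed design might look something like this. It is a sketch in Python rather than .NET, with concurrent.futures.ThreadPoolExecutor standing in for the thread pool. The Fibonacci class here is a hypothetical stand-in, and work and run_all are made-up names.

```python
from concurrent.futures import ThreadPoolExecutor


class Fibonacci:
    """Hypothetical stand-in for the sample's worker class."""

    def __init__(self, n):
        self.n = n
        self.result = None

    def calculate(self):
        a, b = 0, 1
        for _ in range(self.n):
            a, b = b, a + b
        self.result = a
        return self.result


def work(n):
    # The worker object is created here, on a pool thread, so it exists
    # only while its own calculation is running.
    return Fibonacci(n).calculate()


def run_all(inputs, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Every input is still queued as a work item up front, but no
        # Fibonacci instance exists until a thread picks the item up.
        return list(pool.map(work, inputs))
```

With this arrangement the number of live Fibonacci instances is bounded by the number of pool threads, although all the work items are still queued at once.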

Pete
 
Peter Duniho

Kevin said:
[...]
However, what happens if you want to do 10,000 Fibonacci calculations
in the threadpool? You really don't want to make 10,000 instances of
the Fibonacci class as there would be considerable memory overhead
involved.

I guess that depends on your design goal. As long as the class is small and
doesn't itself consume too much in the way of system resources, 10,000
instances may not be all that big of a deal. Even if each instance were 1K
in size (which is pretty huge for a basic data structure), you're only
talking 10MB of data.

What you want to do is create a 'bucket' of say 30 (slightly bigger
than the maxthreads of the threadpool) Fibonacci objects, and a queue
(of size 10,000) of calculations (in this case 'n'). The program
would then take a calculation off the queue, populate a Fibonacci
object with the 'n' and then stick it into the threadpool.

Why make your bucket *any* larger than the size of the thread pool? If
that's the approach you want to take, then there's no need to instantiate a
Fibonacci object until you know it will be assigned a thread.

Note also that if you're going to throttle your use of the thread pool, then
you should consider how much work can actually be *done* at once.
In this particular example, the worker threads are all CPU-bound. That is,
they don't do anything except use the CPU. So there's really no point at
all in having more running worker threads than there are CPUs.

So yes, you could limit your queued work items to the number of threads in
the pool, but if you're going to do that, it actually makes more sense here
to limit your queued work items to the number of CPUs you've got.

If the work is more I/O-bound, then using more threads than CPUs can make
sense, but that has to be determined on a case-by-case basis.
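
As a rough sketch of that kind of throttling, the following caps the number of in-flight work items at the CPU count. It is Python rather than .NET, and fib, run_throttled, and the limit parameter are hypothetical names. It only illustrates the throttling pattern; how much real parallelism you get depends on the runtime.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def run_throttled(inputs, limit=None):
    # Assumption: one in-flight work item per CPU unless told otherwise.
    limit = limit or os.cpu_count() or 1
    slots = threading.BoundedSemaphore(limit)
    results = {}
    lock = threading.Lock()

    def work(index, n):
        try:
            value = fib(n)
            with lock:
                results[index] = value
        finally:
            slots.release()  # free a slot for the next work item

    with ThreadPoolExecutor(max_workers=limit) as pool:
        for index, n in enumerate(inputs):
            slots.acquire()  # blocks while `limit` items are in flight
            pool.submit(work, index, n)
    # Leaving the "with" block waits for the remaining items to finish.
    return [results[i] for i in range(len(results))]
```

Because the submitting loop blocks on the semaphore, at most `limit` work items are ever queued or running, no matter how long the input list is.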

I'll also point out that in this example anyway, the amount of data in the
Fibonacci class is pretty minimal. Two ints and a handle (12 bytes total on
a 32-bit system, plus whatever expense there is for memory management overhead). As I
mentioned above, even 1K data objects don't impose a large strain on modern
computers, even when you're talking about 10,000 instances. Having 10,000
instances of the Fibonacci class is an inconsequential tax on the memory
manager (though it raises a problem with respect to the main thread waiting
on all the workers...see below).

It seems to me that if you're concerned at all about optimizing the general
algorithm, it would make more sense to focus on the costs of the threads
involved. The memory overhead is minimal, while looking at the threads
involved can produce very real benefits (especially for CPU-bound tasks).

When the object is finished, it raises an event or whatever to say it
has finished and goes back in the bucket. The main thread then sees
that there is an object in the bucket and takes another calculation
off the queue and sends the object back into the threadpool.

I am fairly sure I know how I would achieve this. However, I wanted to
know if there are any best practices or existing examples for this kind
of thing that people could point me to. Google searches haven't returned
anything really useful.

I don't know of any "best practices" document. However, it seems to me that
if you're trying to be optimal, the "best practice" will depend on the
specific task at hand.

For example, for this Fibonacci example, all you really need to store in
your queue is the input value for the calculation. In this case, it's
chosen randomly, so you don't even need to store it. Just keep track of
when it's time to "dequeue", and pick a new random number.

Even if you were doing the input values in sequence, you still only need to
maintain a single counter, incremented (or decremented if you prefer) each
time you "dequeue" a new work item.
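
A minimal sketch of such a counter-as-queue, in Python rather than .NET. InputCounter and try_dequeue are hypothetical names. The lock is there because several threads may ask for the next value at the same time.

```python
import threading


class InputCounter:
    """Hypothetical 'queue' that stores only the next input value."""

    def __init__(self, total):
        self._next = 0
        self._total = total
        self._lock = threading.Lock()

    def try_dequeue(self):
        # Returns the next input in sequence, or None once all `total`
        # inputs have been handed out.
        with self._lock:
            if self._next >= self._total:
                return None
            n = self._next
            self._next += 1
            return n
```

However many calculations are pending, the storage cost stays at two integers and a lock.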

Different tasks may require a different way to queue things up, requiring
more overhead (though presumably less overhead than instantiating the whole
worker class instance).

Of course, also keep in mind that one nice design characteristic of the
example is that it can "fire and forget" all of the worker items, and then
just wait on all of them. This makes the code itself a bit cleaner and
clearer. The side effect of *this*, of course, is that you can't wait on an
arbitrarily large list of events; there's a system limit of 64 handles per
wait (WaitHandle.WaitAll throws if it is given more than that).
There are ways around this (one example would be to keep a counter of queued
items and wait on a single event handle in a loop, exiting the loop when the
counter reaches zero).
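
One variant of that workaround, sketched in Python rather than .NET: instead of looping on the event, the last worker to finish signals it, so the main thread waits exactly once. The names fib, run_and_wait, and pending are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def run_and_wait(inputs, max_workers=4):
    inputs = list(inputs)
    results = [None] * len(inputs)
    pending = len(inputs)          # counter of outstanding work items
    lock = threading.Lock()
    all_done = threading.Event()   # the single event the main thread waits on
    if pending == 0:
        all_done.set()

    def work(index, n):
        nonlocal pending
        try:
            results[index] = fib(n)
        finally:
            with lock:
                pending -= 1
                if pending == 0:
                    all_done.set()  # last item finished: wake the main thread

    pool = ThreadPoolExecutor(max_workers=max_workers)
    for index, n in enumerate(inputs):
        pool.submit(work, index, n)
    all_done.wait()
    pool.shutdown()
    return results
```

The number of work items is then limited only by memory, since a single event is waited on regardless of how many items were queued.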

Hope that helps.

Pete
 
