doing some background work occasionally

Peter Duniho

Brian Gideon said:
Typically I would agree with you. The problem I see with the
ThreadPool is that the order the log messages are processed would be
nondeterministic because it's using multiple threads.

Using a shared queue doesn't fix that. Logging to the queue can still be
nondeterministic, as long as the work is still being done using multiple
threads.

Brian Gideon said:
It's usually a
requirement to process log messages serially instead of concurrently so
that they get written to a file (or whatever) in time order.

I'm not sure about "usually". I'll agree with "sometimes". :) In any
case, we haven't been given the requirement or lack thereof one way or the
other.

Pete
 
jcreasy

Very true. I was going under the assumption that these log messages
rarely needed to be entered. You could also get around that problem by
including a DateTime object in the entry, but a separate dedicated
thread would help ensure the timing of your log entries. Another
possible problem with the ThreadPool is that, I believe, only 25 threads
are available to use, but I also assumed that restriction would not
be a problem in this setting. Thanks for your thoughts Brian.
 
Brian Gideon

Peter said:
You appear to be suggesting to simply use a queue for the logging itself.
However, you haven't suggested any mechanism by which the queue itself will
be guaranteed to be in order.

I was suggesting the use of a FIFO queue. It's impossible for such a
queue to store items in anything but temporal order.

Peter said:
Now, it happens that the original poster never suggested that the logged
messages need to be in exactly the same order in which they occurred. It
seems likely that if he's dealing with multiple concurrent tasks that are
not otherwise synchronized with each other, then it doesn't really matter
whether the logged output from those tasks is ordered as well.

*But*...if ordering the logged messages *is* important, as you seem to be
assuming, then simply implementing a queue for the logging doesn't resolve
the out-of-order issue. It just moves it from the i/o part of the code to
the queueing part of the code.

You're right. The OP never said that was a requirement. I made the
leap myself because it's a reasonable assumption to make. It would be
weird for the application to do something in a particular order and
then report that it happened in another.

Peter said:
Queueing the *work* itself in a single thread would resolve the ordering
issue with the logged data (as would timestamping the data, as long as it's
sorted once all processing is done), but a) the original poster hasn't
suggested that the logged data needs to be in order, and b) the original
poster hasn't suggested that it's suitable for each work item to be done
serially (he may prefer that a relatively shorter work item started after a
longer one be allowed to run concurrently, so it can complete before the
longer one does).

I'm not understanding where sorting comes into play or why the OP would
not want log messages to appear in the log in the order they occurred.
Regardless, if the OP doesn't care about the order then certainly the
ThreadPool would be the easiest solution.
 
Brian Gideon

Brian said:
I was suggesting the use of a FIFO queue. It's impossible for such a
queue to store items in anything but temporal order.


I just realized what point you were making. You're right, if there are
multiple threads producing log messages then it would be *very*
difficult to guarantee ordering across all threads. It would not be
too difficult to make the guarantee within a thread though. It would
certainly require more than a trivial queue implementation though.
But, I didn't get the impression from the OP that more than one thread
would be producing log messages. In fact, with all of the talk about a
"main" thread and using the ISynchronizeInvoke methods it sounded like
only 1 was in play. It could very well just be me though :)
 
Peter Duniho

Brian Gideon said:
I just realized what point you were making. You're right, if there are
multiple threads producing log messages then it would be *very*
difficult to guarantee ordering across all threads.

I'm glad you now understand what I was saying. :)

Brian Gideon said:
It would not be
too difficult to make the guarantee within a thread though.

But it's not difficult to make the guarantee within a thread when writing
directly to a log file either. The question of whether errors are logged
directly to a file or put into an in-memory queue first is orthogonal to the
question of how to ensure that the logged entries are in the correct order.
As far as ordering goes, any issues that exist with respect to logging to a
file also exist with respect to logging to a queue (and vice versa).

Brian Gideon said:
It would
certainly require more than a trivial queue implementation though.

Well, I consider timestamping the log entries to be a pretty trivial
solution. Other than that, it's pretty much *impossible* to ensure the
queue entries are ordered. Because there's no way to cause an error and the
logging of that error to be an atomic operation, there is *always* the
possibility that one thread will be interrupted between an error occurring
and the error being logged.

Note that even the timestamping solution doesn't really completely guarantee
the logged events are in the correct order, since a thread could even finish
its timeslice just before retrieving the time for the logged error.
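
For what it's worth, the timestamping idea might look something like the sketch below. The names `LogEntry`, `TimestampedLog` and `Sorted` are made up for illustration, and the `Sequence` counter used as a tie-breaker is my own assumption rather than anything proposed in this thread. As noted above, it still cannot guarantee true cross-thread ordering, because a thread can be preempted between the event and the call to `Log`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical entry type: one record per logged message.
public sealed class LogEntry
{
    public long Sequence;          // assumption: tie-breaker for equal timestamps
    public DateTime TimestampUtc;  // taken when Log is called, not when the event happened
    public int ThreadId;
    public string Message;
}

// Hypothetical in-memory log shared by all worker threads.
public sealed class TimestampedLog
{
    private readonly ConcurrentBag<LogEntry> entries = new ConcurrentBag<LogEntry>();
    private long nextSequence;

    // Safe to call from any thread; ConcurrentBag handles the synchronization.
    public void Log(string message)
    {
        entries.Add(new LogEntry
        {
            Sequence = Interlocked.Increment(ref nextSequence),
            TimestampUtc = DateTime.UtcNow,
            ThreadId = Thread.CurrentThread.ManagedThreadId,
            Message = message
        });
    }

    // Sort once all processing is done: by timestamp, then by sequence number.
    public List<LogEntry> Sorted()
    {
        return entries.OrderBy(e => e.TimestampUtc)
                      .ThenBy(e => e.Sequence)
                      .ToList();
    }
}
```

Each worker calls `Log` as it goes, and whoever writes the file calls `Sorted` at the end. Entries from any one thread keep their relative order as long as the system clock is not adjusted backwards in the meantime.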

On the bright side, as I mentioned before, when one has multiple threads
operating concurrently doing work independent of each other, it would be
*highly* unusual for anyone to care that errors (or other events) logged by
each individual thread are put into the log in precisely the time order in
which they occurred. Only when the threads are somehow working together is
it likely someone would care about the exact order of logged events, and in
that case, the threads can also work together to ensure that order (and yes,
you're right, in that case the implementation of the logging queue would be
non-trivial).

Brian Gideon said:
But, I didn't get the impression from the OP that more than one thread
would be producing log messages.

If you say so. :) Personally, I think it a bit hard to see why you would
say that, at the same time that you talk about the problems of keeping a log
in order. The latter problem occurs only when there is more than one thread
logging. But I'll take your word for it that somehow you had both mutually
exclusive circumstances in mind. :)

Pete
 
Brian Gideon

Peter said:
But it's not difficult to make the guarantee within a thread when writing
directly to a log file either. The question of whether errors are logged
directly to a file or put into an in-memory queue first is orthogonal to the
question of how to ensure that the logged entries are in the correct order.
As far as ordering goes, any issues that exist with respect to logging to a
file also exist with respect to logging to a queue (and vice versa).

I did some more thinking on this. I misspoke. My solution already
guarantees relative ordering for every thread. That's because
it's impossible for a thread to enqueue messages out of order relative
to itself. And since there's only one thread dequeueing it's
impossible for them to be dequeued out of order.

Peter said:
On the bright side, as I mentioned before, when one has multiple threads
operating concurrently doing work independent of each other, it would be
*highly* unusual for anyone to care that errors (or other events) logged by
each individual thread are put into the log in precisely the time order in
which they occurred. Only when the threads are somehow working together is
it likely someone would care about the exact order of logged events, and in
that case, the threads can also work together to ensure that order (and yes,
you're right, in that case the implementation of the logging queue would be
non-trivial).

Hmm...I sort of agree. I do disagree on one important point. No one
cares that thread A races with thread B when logging. It doesn't
matter that if A does something first and then B does its thing next
that B's log message appears first in the log. What people do care
about is that A's log messages are written to the log in the order that
they occurred. For example, if A performs tasks 1 and 2 then the log
message for 1 should be written before the log message for 2. A
solution using the ThreadPool won't guarantee that because the
persisting of log messages can be dispatched to different threads.
Contrast that with my solution where log messages are dispatched to a
single thread.

A little code might help.

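using System.Threading;

// Note: BlockingQueue is not a .NET Framework type. It stands for a
// thread-safe FIFO queue whose Dequeue blocks until an item is available.
public class Logger
{
    private Thread thread;
    private BlockingQueue queue;

    public Logger()
    {
        queue = new BlockingQueue();
        thread = new Thread(this.ThreadMethod);
        thread.Start();
    }

    // Called by any thread; only enqueues, so the caller does not wait
    // for the message to be persisted.
    public void Log(string message)
    {
        queue.Enqueue(message);
    }

    // Runs on the single logger thread, so messages are persisted in
    // exactly the order they were enqueued.
    private void ThreadMethod()
    {
        while (true)
        {
            string message = queue.Dequeue();
            // Persist the message to a file, database, etc.
        }
    }
}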

Notice that since there is only one thread removing from the queue the
order that messages are received by the Logger is the order they are
persisted. And it's impossible for any specific thread to queue its
messages out of order.
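
Since `BlockingQueue` above is not a framework class, here is a minimal self-contained sketch of the same idea, assuming .NET 4 or later where `BlockingCollection<string>` is available (it is FIFO by default because it wraps a `ConcurrentQueue<T>`). The class name `QueueLogger`, the `persist` delegate and the `Dispose` shutdown logic are assumptions added for illustration; they are not part of the code above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical single-consumer logger: many producers, one writer thread.
public sealed class QueueLogger : IDisposable
{
    private readonly BlockingCollection<string> queue = new BlockingCollection<string>();
    private readonly Thread thread;
    private readonly Action<string> persist; // assumption: caller supplies the sink

    public QueueLogger(Action<string> persist)
    {
        this.persist = persist;
        thread = new Thread(ThreadMethod) { IsBackground = true };
        thread.Start();
    }

    // Called by any thread; only enqueues, so it does not wait on the sink.
    public void Log(string message)
    {
        queue.Add(message);
    }

    private void ThreadMethod()
    {
        // Blocks while the queue is empty and ends once CompleteAdding
        // has been called and the remaining items have been drained.
        foreach (string message in queue.GetConsumingEnumerable())
        {
            persist(message);
        }
    }

    // Stop accepting messages, flush what is queued, then release the queue.
    public void Dispose()
    {
        queue.CompleteAdding();
        thread.Join();
        queue.Dispose();
    }
}
```

Usage would be something like `using (var logger = new QueueLogger(Console.WriteLine)) { logger.Log("started"); }`. Because only one thread ever calls `persist`, the sink itself does not need to be thread-safe.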

Brian
 
Peter Duniho

Brian Gideon said:
I did some more thinking on this. I misspoke. My solution already
guarantees relative ordering for every thread. That's because
it's impossible for a thread to enqueue messages out of order relative
to itself.

No message logging implementation should ever have a problem with messages
*from a single thread* being out of order. Someone would have to go to
*extra* work to make that a possibility.

So I don't really see what you're trying to say here. You might as well say
that a thread never has to worry about the program statements executed by
that thread ever executing out of order (ignoring for a moment CPU
implementations that do just that). It's trivially true, but not all that
interesting or useful to know.

In any case, whatever guarantees you can make using a "queue message, write
to log file later" implementation, you can just as easily make using a
"write to log file immediately" implementation. The underlying
implementation doesn't affect the question of what order log entries occur
in.

Brian Gideon said:
[...]
Hmm...I sort of agree. I do disagree on one important point. No one
cares that thread A races with thread B when logging. It doesn't
matter that if A does something first and then B does its thing next
that B's log message appears first in the log.

That's exactly what I said. How are you disagreeing with me?

Brian Gideon said:
What people do care
about is that A's log messages are written to the log in the order that
they occurred.

Yes, they do care about that. However, that happens naturally in any
typical message logging implementation. Since the statements within a given
thread always execute in order, it is trivial to ensure that logged messages
from a given thread are always in order.

Brian Gideon said:
For example, if A performs tasks 1 and 2 then the log
message for 1 should be written before the log message for 2. A
solution using the ThreadPool won't guarantee that because the
persisting of log messages can be dispatched to different threads.

I have no idea why you think that the "persisting of log messages" would be
"dispatched to different threads". Certainly no one here has suggested
anything like that. You'd have to go to extra work to do that. The threads
aren't present for the purpose of logging messages...they are present for
the purpose of doing work. Any messages they log will necessarily occur in
the correct order, relative to each thread's own work.

Brian Gideon said:
Contrast that with my solution where log messages are dispatched to a
single thread.

Why dispatch a log message to a thread at all? Messages should be logged to
a data structure, if not written directly to a file, shared (and
synchronized, of course) by all threads using it.

Brian Gideon said:
A little code might help.
[...]
Notice that since there is only one thread removing from the queue the
order that messages are received by the Logger is the order they are
persisted. And it's impossible for any specific thread to queue its
messages out of order.

You could just as easily replace your queue and thread with a single "Log"
method that does the "Persist the message to a file, database, etc" work you
have delegated to a whole new thread. It would work just as well, from a
message ordering standpoint.
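
To make that concrete, a minimal sketch of such a "write immediately" logger might look like the following. The class name `SynchronousLogger`, the `gate` lock object and the choice of a `TextWriter` as the destination are assumptions for illustration only. The caller does wait for the write to finish, which is the trade-off against the queue-and-thread version.

```csharp
using System;
using System.IO;

// Hypothetical logger shared by all worker threads; no extra thread involved.
public sealed class SynchronousLogger
{
    private readonly object gate = new object();
    private readonly TextWriter writer; // assumption: file, console, or any other writer

    public SynchronousLogger(TextWriter writer)
    {
        this.writer = writer;
    }

    // The lock serializes writers, so each message is written whole and
    // messages from any one thread appear in the order that thread logged them.
    public void Log(string message)
    {
        lock (gate)
        {
            writer.WriteLine(message);
        }
    }
}
```

For example, `new SynchronousLogger(Console.Out)` could be handed to every worker thread.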

Pete
 
Brian Gideon

Peter said:
No message logging implementation should ever have a problem with messages
*from a single thread* being out of order. Someone would have to go to
*extra* work to make that a possibility.

But, that's the exact problem you get by using a ThreadPool
implementation for an asynchronous logger.

Peter said:
So I don't really see what you're trying to say here. You might as well say
that a thread never has to worry about the program statements executed by
that thread ever executing out of order (ignoring for a moment CPU
implementations that do just that). It's trivially true, but not all that
interesting or useful to know.

I don't think it's that trivial. But anyway, I wanted to mention it to
eliminate any confusion.

Peter said:
In any case, whatever guarantees you can make using a "queue message, write
to log file later" implementation, you can just as easily make using a
"write to log file immediately" implementation. The underlying
implementation doesn't affect the question of what order log entries occur
in.

I don't think so. ThreadPool.QueueUserWorkItem will queue the message
and write later, but do so in an unpredictable order. A "write to log
file immediately" approach would always result in a correctly ordered
file.

Peter said:
[...]
Hmm...I sort of agree. I do disagree on one important point. No one
cares that thread A races with thread B when logging. It doesn't
matter that if A does something first and then B does its thing next
that B's log message appears first in the log.

That's exactly what I said. How are you disagreeing with me?

I apologize. I misunderstood what you said then. In fact, I went back
and reread your post and can see that we agree on this.

Peter said:
Yes, they do care about that. However, that happens naturally in any
typical message logging implementation. Since the statements within a given
thread always execute in order, it is trivial to ensure that logged messages
from a given thread are always in order.

We're not discussing a typical logger implementation though.

Peter said:
I have no idea why you think that the "persisting of log messages" would be
"dispatched to different threads". Certainly no one here has suggested
anything like that. You'd have to go to extra work to do that. The threads
aren't present for the purpose of logging messages...they are present for
the purpose of doing work. Any messages they log will necessarily occur in
the correct order, relative to each thread's own work.

The log messages have to be dispatched to some asynchronous mechanism
for them to be written asynchronously without blocking the application
thread. The ThreadPool is an excellent mechanism for asynchronous
processing which dispatches work items to several threads. However, it
doesn't guarantee a completion order for those work items. That's the
issue I have with it.
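
To illustrate the concern, here is a small sketch that hands each message to `ThreadPool.QueueUserWorkItem` as its own work item. The class `ThreadPoolLoggingDemo` and its `Run` method are hypothetical names, and the `List<string>` stands in for the real log file. Nothing here forces the messages out of order; the point is only that nothing keeps them in order either, so the result may or may not match the order in which the items were queued.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThreadPoolLoggingDemo
{
    // Queues one pool work item per message and returns the messages
    // in the order the pool threads actually "wrote" them.
    public static List<string> Run(int count)
    {
        var written = new List<string>(); // stand-in for the log file
        using (var done = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                string message = "message " + i;
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // The lock protects the list but says nothing about
                    // which work item gets here first.
                    lock (written) { written.Add(message); }
                    done.Signal();
                });
            }
            done.Wait(); // wait until every work item has run
        }
        return written;
    }
}
```

Comparing the list returned by `Run(1000)` with the submission order on a multi-core machine is a quick way to see whether reordering happens in practice.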
Peter said:
Why dispatch a log message to a thread at all? Messages should be logged to
a data structure, if not written directly to a file, shared (and
synchronized, of course) by all threads using it.

Because the OP wants an asynchronous logger. There's no way that I
know of to prevent one thread from blocking without the use of another
thread, thread pool, fiber, IO completion port, or some other type of
asynchronous mechanism.

Peter said:
You could just as easily replace your queue and thread with a single "Log"
method that does the "Persist the message to a file, database, etc" work you
have delegated to a whole new thread. It would work just as well, from a
message ordering standpoint.

That would block the logging thread which is what the OP wanted to
prevent. It would satisfy the ordering requirement (which I claim is
necessary), but wouldn't satisfy the asynchronous requirement.
 
Peter Duniho

Brian Gideon said:
Because the OP wants an asynchronous logger.

I have no idea where you got that idea. The original poster was asking
about how best to do "some background work occasionally" (see the subject
line, if not his original post). There has been no mention whatsoever about
the *logging* itself occurring asynchronously. Only that the worker threads
should be able to log errors. In fact, so far the *only* event that the OP
has stated a need to log is an error that would end the worker thread's
work; obviously this need not be asynchronous.

Frankly, I'm a bit dubious that one could do any better creating new threads
(or using a thread pool) to log data than to simply use some synchronized
object (queue, file, whatever). Especially an in-memory queue, but even a
file, is not going to remain blocked for very long. The time it would take
to start up a new thread, or even to unblock an existing one and get it to
start running some code is easily in the same ballpark as, if not worse
than, the time a thread would be expected to take to add a new bit of data
to a queue or write it to disk (remember...file i/o is normally
cached...even if you're dealing with a slow i/o device, it is not likely
that the thread doing the writing is actually going to have to wait for that
as long as the data is reasonably small, which it would have to be if the
programmer expects to get decent performance out of ANY logging mechanism).

But, even if you make the assumption that running the *logging* itself on a
different thread is desirable in some situations, there's nothing about this
thread that suggests that's the design goal here.

Brian Gideon said:
[...]
That would block the logging thread which is what the OP wanted to
prevent.

Again, I've seen nothing in this thread to suggest that the OP wants to
avoid blocking the logging thread.

Pete
 
Marc Gravell

Peter Duniho said:
There has been no mention whatsoever about the *logging* itself occurring
asynchronously

see the first post:

OP> i.e. Logging, which must not block the main thread.

Logging was only an example

Marc
 
Marc Gravell

And for the grammar police: OK - fair enough "i.e." does not mean this is an
example; I inferred (from context) that this meant "e.g."...

Marc
 
Brian Gideon

Peter said:
I have no idea where you got that idea. The original poster was asking
about how best to do "some background work occasionally" (see the subject
line, if not his original post). There has been no mention whatsoever about
the *logging* itself occurring asynchronously. Only that the worker threads
should be able to log errors. In fact, so far the *only* event that the OP
has stated a need to log is an error that would end the worker thread's
work; obviously this need not be asynchronous.

The OP did specifically mention that logging should not block the main
thread.

Peter said:
Frankly, I'm a bit dubious that one could do any better creating new threads
(or using a thread pool) to log data than to simply use some synchronized
object (queue, file, whatever). Especially an in-memory queue, but even a
file, is not going to remain blocked for very long. The time it would take
to start up a new thread, or even to unblock an existing one and get it to
start running some code is easily in the same ballpark as, if not worse
than, the time a thread would be expected to take to add a new bit of data
to a queue or write it to disk (remember...file i/o is normally
cached...even if you're dealing with a slow i/o device, it is not likely
that the thread doing the writing is actually going to have to wait for that
as long as the data is reasonably small, which it would have to be if the
programmer expects to get decent performance out of ANY logging mechanism).

Yes, that I can definitely agree with. I'm also unclear as to why
asynchronous logging is a requirement. Most people write logs to the
console, a file, or the event log, all of which are reliable and fast. But, I
have seen some who like to write logs to a file on a network share,
database, or some other remote resource. In that case an asynchronous
logger could be considered imperative.

Peter said:
But, even if you make the assumption that running the *logging* itself on a
different thread is desirable in some situations, there's nothing about this
thread that suggests that's the design goal here.
[...]
That would block the logging thread which is what the OP wanted to
prevent.

Again, I've seen nothing in this thread to suggest that the OP wants to
avoid blocking the logging thread.


Strange. I thought the theme of an asynchronous logger pervaded the
OP's entire post.
 
Peter Duniho

Brian Gideon said:
[...]
Strange. I thought the theme of an asynchronous logger pervaded the
OP's entire post.

Well, I didn't. I do see how the word "logging" got included as something
the worker thread would do, and I admit that I was not focusing on that with
respect to my replies. But I think "pervade" is overstating things, and the
fact that I didn't pick up on that didn't appear to concern the original
poster in his replies.

That said, I agree that if the original poster intends to use the worker
threads only for logging, *and* if he intends to assign a new thread each
time he wants to log something, he's heading for trouble, and in exactly the
way you suggest.

Thanks,
Pete
 
