Uniquely Identifying Multiple/Concurrent Async Tasks

Frankie

It appears that System.Random would provide an acceptable means through
which to generate a unique value used to identify multiple/concurrent
asynchronous tasks.

The usage of the value under consideration here is that it is supplied to
the AsyncOperationManager.CreateOperation(userSuppliedState) method... with
userSuppliedState being, more or less, a taskId.

In this case, the userSuppliedState {really taskId} is of the object type,
and could therefore be just about anything. Consequently it appears to me
that a unique integer as generated by System.Random would suffice (and yes,
I understand that System.Random doesn't provide *truly* random values).

Would you concur that System.Random would be "good enough" - or would you
recommend some better alternative for generating the taskId?

Thanks!
 
Frankie,

If you really need something that is pretty much guaranteed (but not
completely) to be random, and unique, then I suggest you use a Guid
instance. You would have to generate new Guids at a rate of something like
5000/second for the next billion years or something ridiculous like that
before you actually create a duplicate.
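As a rough illustration of the Guid suggestion, here is a minimal sketch that passes a fresh Guid as the userSuppliedState. The class and method names (TaskIds, CreateWithGuid) are made up for this example; only Guid.NewGuid and AsyncOperationManager.CreateOperation come from the framework.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper, not part of any framework API.
public static class TaskIds
{
    // Creates an AsyncOperation whose UserSuppliedState is a new Guid.
    public static AsyncOperation CreateWithGuid()
    {
        // The Guid is boxed when stored as object; two boxed Guids with the
        // same value still compare equal via Equals, so it works as a key.
        object taskId = Guid.NewGuid();
        return AsyncOperationManager.CreateOperation(taskId);
    }
}
```

The caller can read the id back from the operation's UserSuppliedState property and hand it to event args when the operation completes.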
 
Frankie said:
It appears that System.Random would provide an acceptable means through
which to generate a unique value used to identify multiple/concurrent
asynchronous tasks.
Nope.

The usage of the value under consideration here is that it is supplied to
the AsyncOperationManager.CreateOperation(userSuppliedState) method... with
userSuppliedState being, more or less, a taskId.

How are you using it as an ID? Can you provide a more concrete example?
In this case, the userSuppliedState {really taskId} is of the object type,
and could therefore be just about anything. Consequently it appears to me
that a unique integer as generated by System.Random would suffice (and yes,
I understand that System.Random doesn't provide *truly* random values).

It can be just about anything. Typically, it would be a class that
stores context important to the task instance.
Would you concur that System.Random would be "good enough" - or would you
recommend some better alternative for generating the taskId?

No, it would be awful. Random numbers aren't guaranteed to be unique.

If each task requires some sort of unique context, why not just create a
class that can contain this context, store the data related to the
context in the class, and use a reference to the class as your "unique
value"?

If you really just need an integer, why not just use a sequential
number? If you could potentially create 4 billion tasks over time,
you'll have to check newly generated numbers to make sure they aren't in
use, but that would be a requirement if you're using Random anyway,
since those aren't guaranteed to be unique.

Pete
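If a plain integer really is all that is needed, the sequential-number idea could look something like the sketch below. SequentialTaskId is a hypothetical name; Interlocked.Increment is the only framework call, and it makes the counter safe to use from several threads at once.

```csharp
using System.Threading;

// Hypothetical helper that hands out sequential task ids.
public static class SequentialTaskId
{
    private static int _last;

    // Returns the next id. Interlocked.Increment is atomic, so concurrent
    // callers never receive the same value until the counter wraps around
    // after roughly 4 billion calls.
    public static int Next()
    {
        return Interlocked.Increment(ref _last);
    }
}
```

Unlike values from System.Random, these cannot collide until the counter wraps, which is the only point where an "is this id still in use?" check becomes necessary.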
 
<snip>
Re:
<< If each task requires some sort of unique context,... >>

I'm not sure what you mean by "context" here... what I'm referring to is
that the task just needs a unique _identifier_. What I'm doing is
implementing the Event-based async pattern. So in this case I have an async
operation - for which there can be multiple concurrent operations going on.
For example, it could be a method named GrabFileAsync that retrieves a file
from some remote location. The client could request 15 different files - so
we'd have 15 GrabFileAsync() calls - with potentially all 15 of them running
concurrently. Each of these 15 concurrent operations needs to be uniquely
identified. The client, upon calling GrabFileAsync() would then supply a
unique identifier. When any of the 15 async operations completes or reports
progress, etc, the client would then use the "task id" to identify which
particular GrabFileAsync() operation has completed, etc.

It's entirely possible that your meaning of "context" is similar to mine.
If so, maybe you could clarify why an integer isn't the best, and why going
with some "context" class would be better. If not, are you thinking
SynchronizationContext or something like that? If so, then you'd be missing
the fact that the AsyncOperation - which is part of the Event-based async
pattern implementation I'm going with - basically encapsulates the
underlying SynchronizationContext.... so therefore no need for me to supply
SynchronizationContext.

-F
 
Frankie,
Typically what you would do here is pass your own custom StateObject class
instance as the UserSupplied State parameter. In the callback method you can
cast the received state parameter back to an instance of your StateObject,
which would have fields that identify the file name or whatever it is you're
doing.
In the example code for the AsyncOperationManager, the taskID is stored in a
HybridDictionary. If you want to make it easy for it to be unique and you
don't really need any additional state info to capture, just use a guid as was
mentioned.
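A minimal sketch of the state-object approach described above, assuming a GrabFileAsync-style operation. GrabFileState and GrabFileCallbacks are hypothetical names, and the callback here is only a stand-in for whatever handler receives the user state as an object.

```csharp
using System;

// Hypothetical state class: one instance per GrabFileAsync call.
public sealed class GrabFileState
{
    public GrabFileState(string fileName)
    {
        FileName = fileName;
    }

    public string FileName { get; }
}

public static class GrabFileCallbacks
{
    // Stand-in for a completion handler. The state arrives typed as object
    // and is cast back to the class that was originally passed in.
    public static string DescribeCompleted(object userState)
    {
        var state = (GrabFileState)userState;
        return "Completed: " + state.FileName;
    }
}
```

Each instance is unique by reference, so it identifies the operation and carries the file name at the same time, with no separate lookup.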
-- Peter
 
Frankie said:
I'm not sure what you mean by "context" here... what I'm referring to is
that the task just needs a unique _identifier_.

But what are you going to do with that identifier? For example, if all
you're going to do is use it to look up some data specific to the
specific instance of the operation, then why not just use the data
itself as the unique identifier?
[...]
Each of these 15 concurrent operations needs to be uniquely
identified.

Of course.
The client, upon calling GrabFileAsync() would then supply a
unique identifier. When any of the 15 async operations completes or reports
progress, etc, the client would then use the "task id" to identify which
particular GrabFileAsync() operation has completed, etc.

But what does "identifying" which operation has completed gain you? The
identification does you no good unless you somehow correlated the
identification with some data specific to the operation. So you might
as well use the data itself as your unique identifier.
It's entirely possible that your meaning "context" is the similar to mine.
If so, maybe you could clarify why an integer isn't the best, and why going
with some "context" class would be better.

See above. All that adding an integer into the design does is create an
extra level of indirection you need to resolve upon completion of an
operation. I don't see the point in doing that.

As an example, consider the Socket class. When you call BeginReceive,
you pass a "state" parameter. This is what I'm calling "context". In
the most basic case, the code using the Socket instance would typically
pass at a minimum the Socket instance reference itself. Then in the
receive callback, the "state" parameter passed to the callback can be
cast back to a Socket which can then be used to complete the operation
(calling EndReceive(), for example).

If there were other data related to the receive operation that was
important (for example, perhaps the Socket is being used to transfer a
file and you want to easily get the FileStream you're using to save the
data to the disk), then you'd have a class that contains both the Socket
instance reference as well as that other data (for example, the
FileStream reference). Then in the receive callback method, you just
cast the "state" parameter back to your particular class, and from that
retrieve the Socket instance and other data (such as the FileStream
instance reference).

If you instead use a unique integer, then instead of just casting the
value to the appropriate class, you have to use that integer to look up
an instance of the appropriate class in some data structure, like an
Array or Dictionary<>. Why add that extra bit of work when you could
just pass the reference you want in the first place?

Pete
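The Socket example might be sketched roughly as follows. ReceiveContext and FileReceiver are hypothetical names and error handling is left out; Socket.BeginReceive, Socket.EndReceive and IAsyncResult.AsyncState are the framework members being illustrated.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

// Hypothetical context class: bundles the socket with the other data the
// receive callback needs, so no id-to-object lookup is required.
public sealed class ReceiveContext
{
    public ReceiveContext(Socket socket, Stream output, int bufferSize)
    {
        Socket = socket;
        Output = output;
        Buffer = new byte[bufferSize];
    }

    public Socket Socket { get; }
    public Stream Output { get; }
    public byte[] Buffer { get; }
}

public static class FileReceiver
{
    // Starts one receive, passing the context itself as the "state" argument.
    public static void Start(ReceiveContext ctx)
    {
        ctx.Socket.BeginReceive(ctx.Buffer, 0, ctx.Buffer.Length,
                                SocketFlags.None, OnReceive, ctx);
    }

    private static void OnReceive(IAsyncResult ar)
    {
        // Cast the state back; everything needed is reachable from it.
        var ctx = (ReceiveContext)ar.AsyncState;
        int read = ctx.Socket.EndReceive(ar);
        if (read > 0)
        {
            ctx.Output.Write(ctx.Buffer, 0, read);
            Start(ctx); // keep receiving until the peer closes the connection
        }
    }
}
```

With an integer id instead, OnReceive would first have to look the context up in a dictionary, which is the extra indirection being argued against.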
 
Thanks for the dialog on this.... to continue with it...

But what are you going to do with that identifier? For example, if all
you're going to do is use it to look up some data specific to the specific
instance of the operation, then why not just use the data itself as the
unique identifier?

The identifier wouldn't necessarily be used to look up data [related to the
operation]. The identifier would identify the specific instance of the
operation, itself. There's no reason that we _must_ associate any data with
an async operation... just have it go to work. I would agree with the
position that states we would _usually_ have data to associate with an async
operation, and therefore could/should use that to ID the operation as you
are suggesting.

[...]
Each of these 15 concurrent operations needs to be uniquely identified.

Of course.
The client, upon calling GrabFileAsync() would then supply a unique
identifier. When any of the 15 async operations completes or reports
progress, etc, the client would then use the "task id" to identify which
particular GrabFileAsync() operation has completed, etc.

But what does "identifying" which operation has completed gain you? The
identification does you no good unless you somehow correlated the
identification with some data specific to the operation. So you might as
well use the data itself as your unique identifier.

Yes, but that's assuming that data must or can be associated with an async
operation prior to initiating it. Can't we have async operations without
associated data? We can certainly have methods that return void and take
zero parameters. They just "do something". What's to say we can't call such
a method asynchronously? At the bottom of this post I have a more detailed
example. The information we might want from such an asynchronous call might
include things about the call, itself, which might not be available before
calling the method. Such data about the call, itself, might include things
like "user cancelled the operation before it completed", "operation ran into
Exception xyz during the course of its operations", or "the operation just
now completed." In these cases, we'd have to invent some identifier out of
thin air - perhaps a unique integer, which is what I was thinking in the OP
here. Maybe I'm still wrong about that.

See above. All that adding an integer into the design does is create an
extra level of indirection you need to resolve upon completion of an
operation. I don't see the point in doing that.

Your "unnecessary indirection" point is well taken, but only in cases where
we would have data to associate with the operation prior to kicking it off.
If no data exists prior to initiating the async operation, then we'd have to
come up with _some_ way to ID the operation.

As an example, consider the Socket class. When you call BeginReceive, you
pass a "state" parameter. This is what I'm calling "context". In the
most basic case, the code using the Socket instance would typically pass
at a minimum the Socket instance reference itself. Then in the receive
callback, the "state" parameter passed to the callback can be cast back to
a Socket which can then be used to complete the operation (calling
EndReceive(), for example).

If there were other data related to the receive operation that was
important (for example, perhaps the Socket is being used to transfer a
file and you want to easily get the FileStream you're using to save the
data to the disk), then you'd have a class that contains both the Socket
instance reference as well as that other data (for example, the FileStream
reference). Then in the receive callback method, you just cast the
"state" parameter back to your particular class, and from that retrieve
the Socket instance and other data (such as the FileStream instance
reference).

Great example... couldn't agree more! But this is an example where we
actually have something to run with before kicking off the async operation.
If you instead use a unique integer, then instead of just casting the
value to the appropriate class, you have to use that integer to look up an
instance of the appropriate class in some data structure, like an Array or
Dictionary<>. Why add that extra bit of work when you could just pass the
reference you want in the first place?


Now, I can see this next question "coming down mainstreet" ---> Okay
Frankie, what is an example of an event for which we wouldn't have at least
_some_ data with which to uniquely ID the async operation prior to kicking
it off? Here goes...

I'm writing a utility app that will be used to update a bunch of Web sites
by copying files (.aspx, gif, etc) to various site directories. The utility
additionally updates the underlying SQL Server database by (1) executing DDL
scripts and (2) installing or updating stored procedures. Prior to launching
an update operation, the utility "validates" the destination Web sites and
SQL Server databases to be updated. Specifically, the Validate method
ensures that (1) each Web site's root directory exists, and that all required
subdirectories exist. A separate Validate method verifies connectivity to
the SQL Server databases. The utility does practically none of this grunt
work, itself. Rather, it dynamically loads "installers" which are classes
that implement a common IInstaller interface, which defines a Validate
method. The client, here, doesn't know what is being validated. All it's
doing is looping through its list of IInstallers and telling each to
Validate its environment. I'm modifying this arrangement so that the
Validate methods can run asynchronously (i.e., so the interface will be
modified to define ValidateAsync() in addition to the synchronous Validate()
method). All the client app will do is kick off these ValidateAsync
operations which will in turn report (1) progress (i.e., "SomeWebSite.com
validated successfully", or "failed to connect to TheDbNamedX" etc - for
each Web site and for each db), and (2) report when each ValidateAsync
operation completes (including the usual AsyncCompletedEventArgs stuff). In
this scenario I'm not sure how the client app that initiates these
ValidateAsync operations would identify each async operation without
assigning some unique ID invented out of thin air.

I'd appreciate your further perspective on this.

-Frankie
 
Frankie said:
The identifier wouldn't necessarily be used to look up data [related to the
operation].
Okay.

The identifier would identify the specific instance of the
operation, itself.

For what purpose do you need to identify the specific instance of the
operation? If you are not going to use the identifier as a way of
correlating the operation with some other data, why do you need it at all?
There's no reason that we _must_ associate any data with
an async operation... just have it go to work. I would agree with the
position that states we would _usually_ have data to associate with an async
operation, and therefore could/should use that to ID the operation as you
are suggesting.

You are right that there's no reason you must associate any data with an
async operation. But if there's no data associated with it, what need
for an identifier exists?
Yes, but that's assuming that data must or can be associated with an async
operation prior to initiating it.

Or as part of initiating it, anyway.
Can't we have async operations without
associated data?

Sure, no reason you can't have that.
We can certainly have methods that return void and take
zero parameters. They just "do something". What's to say we can't call such
a method asynchronously?

Nothing at all. But if that's all you have, why do you need an identifier?
At the bottom of this post I have a more detailed
example. The information we might want from such an asynchronous call might
include things about the call, itself, which might not be available before
calling the method. Such data about the call, itself, might include things
like "user cancelled the operation before it completed", "operation ran into
Exception xyz during the course of its operations", or "the operation just
now completed." In these cases, we'd have to invent some identifier out of
thin air - perhaps a unique integer, which is what I was thinking in the OP
here. Maybe I'm still wrong about that.

My previous comments have related primarily to data specific to the
client of the task. You have mentioned above things which may be
specific to the class implementing the task, and I agree that's not data
that the client should have to instantiate.

But that doesn't mean that the data isn't instantiated somewhere.

If what you want is a way for the client to retrieve information about
the task, it seems to me that the model there would be to have the task
itself represented by a class, returned by the method that starts the
asynchronous task. That class could implement IAsyncResult if you like,
but of course there's no _requirement_ that async operations follow the
.NET IAsyncResult pattern.
[...]
All the client app will do is kick off these ValidateAsync
operations which will in turn report (1) progress (i.e., "SomeWebSite.com
validated successfully", or "failed to connect to TheDbNamedX" etc - for
each Web site and for each db), and (2) report when each ValidateAsync
operation completes (including the usual AsyncCompletedEventArgs stuff). In
this scenario I'm not sure how the client app that initiates these
ValidateAsync operations would identify each async operation without
assigning some unique ID invented out of thin air.

Well, that brings me back to my previous question. What would you use
the ID for? Why does the client need to identify each async operation?
And why can that identification not be accomplished via an instance of
some class that specifically refers to the operation in some way?

The async patterns in .NET I'm most familiar with are the Begin/End
pattern that involves IAsyncResult, and the BackgroundWorker class that
uses events. It's not entirely clear from your post which you are
proposing, but the above suggests you are doing an event-based design.

If you have a need for the client to retrieve information about the task
before it's completed, then I would say having the async method
returning a class that provides for this (in the form of properties, for
example) would be appropriate. This class could also be included by
reference in your EventArgs class, so that the event handler also had
easy access to the task information.

Still, in the end you don't need an ID. You need some kind of reference
to the data you are trying to get.

If you feel that there's a use for an ID for a purpose other than
retrieving data (including state information), then I'd say you still
haven't been clear about how that would work, or what purpose that would
be. I'm a little tired at this point, so maybe I'm missing some obvious
point. But whatever the reason, I seem to need a more basic,
fundamental description in order for me to understand what the need for
a numeric ID is.

Pete
 
Point of clarification: I'm going with the Event-based async pattern... not
the IAsyncResult pattern.

Now to continue the discussion ...


<snip> <snip> <snip>
Re:
<<For what purpose do you need to identify the specific instance of the
operation?>>
<< But if there's no data associated with it, what need for an identifier
exists >>
<< Nothing at all. But if that's all you have, why do you need an
identifier?>>
and
<< Well, that brings me back to my previous question. What would you use
the ID for? Why does the client need to identify each async operation? >>

Simple answer is that the client may want to cancel the async operation
before it completes - or perhaps pause the operation and resume later.

RE:
<< And why can that identification not be accomplished via an instance of
some class that specifically refers to the operation in some way? >>

It _could_ be. But in my case I don't already have a class that refers to
the async operation. In my case, the client is initiating the async
operation by calling the ValidateAsync method of an interface - so the
client doesn't know anything about the particular class implementing the
async operation (other than that it has a ValidateAsync method that, in turn
triggers the async operation). In fact, the client isn't doing the grunt
work of kicking off the async operation. The client is telling the class
that provides for the Event-based async pattern implementation to start the
async operation. It is that event publisher/worker class that ultimately
calls BeginInvoke on a delegate representing the worker method - not the
client. All my client knows is that it has a few IInstaller instances - each
of which implements the ValidateAsync method. ValidateAsync takes a
parameter that can be any object. That parameter is subsequently used in the
"event publisher/worker class" to identify the async operation. I was
thinking this parameter - originating in the client- could be a unique
integer, being that there really was no class or other obvious way for the
client to identify the particular async operation.



Re:
<< But that doesn't mean that the data isn't instantiated somewhere >>

I'd agree that some data is _likely_ instantiated somewhere (in client or in
the async operation itself), but I don't see that as a requirement of
anything (and I suspect you don't either), and in my case my client simply
doesn't have it. If my client had it, then I'd consider it a "no brainer" to
use it in the manner you are suggesting. In the example of the real-world
utility I'm writing (described in my previous post... don't want to rehash
it here), the ValidateAsync methods do some activity, like checking to see
if a bunch of NTFS folders exist, that SQL server databases are accessible,
etc. There's really no data involved beyond status messages communicated
back to the client via events (e.g., "The Sql Server DB Named XYZ is
accessible"). That's data generated _by_ the async operation, but not data
_about_ the async operation itself.

Re:
If what you want is a way for the client to retrieve information about the
task,

I actually don't want for the client to retrieve information about the task.
The only things [that I can think of right now] that the client would need
to do after initiating the async operation is to (1) allow for the async
operation to be cancelled from the client, and (2) receive data from the
async operation via events raised from within the async operation or its
publishing class instance. Perhaps the client keeps a simple count of the
number of outstanding async operations. As each one completes, it
communicates completion to the client via the firing of a
MethodNameAsyncCompleted event. The client can then remove the operation -
as identified by the arbitrary ID (possibly integer) communicated in the
EventArgs - from its list of outstanding async operations. When the count
gets to zero, then the client can proceed to do whatever (enable a button
that can't be enabled until all pending async operations completed - yet let
the user do other stuff in the UI).
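The bookkeeping described above could be sketched along these lines. OutstandingOperations, Register and AllCompleted are hypothetical names; AsyncCompletedEventArgs and its UserState property are the framework pieces, and the handler assumes the publisher echoes the client's task id back in UserState.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical client-side tracker for pending async operations.
public sealed class OutstandingOperations
{
    private readonly HashSet<object> _pending = new HashSet<object>();
    private readonly object _gate = new object();

    // Raised when the last pending operation has completed.
    public event EventHandler AllCompleted;

    public int Count
    {
        get { lock (_gate) { return _pending.Count; } }
    }

    // Creates a task id, records it as pending, and returns it so the
    // caller can pass it to a MethodNameAsync(taskId) call.
    public object Register()
    {
        object taskId = Guid.NewGuid();
        lock (_gate) { _pending.Add(taskId); }
        return taskId;
    }

    // Intended as the handler for a MethodNameAsyncCompleted event.
    public void OnOperationCompleted(object sender, AsyncCompletedEventArgs e)
    {
        bool empty;
        lock (_gate)
        {
            _pending.Remove(e.UserState);
            empty = _pending.Count == 0;
        }
        if (empty)
        {
            AllCompleted?.Invoke(this, EventArgs.Empty);
        }
    }
}
```

A UI client could enable its button in an AllCompleted handler. Completion events may arrive on whichever thread the publisher's SynchronizationContext dictates, hence the lock.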

....
it seems to me that the model there would be to have the task itself
represented by a class, returned by the method that starts the
asynchronous task. That class could implement IAsyncResult if you like,
but of course there's no _requirement_ that async operations follow the
.NET IAsyncResult pattern.


I hadn't thought of doing that. Thanks. I haven't yet learned the
IAsyncResult pattern, so maybe this idea of yours would come in quite handy
there. I can see where it might be useful with the Event-based async pattern
too... just not in the case that sparked the OP here.


RE:
<< I seem to need a more basic, fundamental description in order for me to
understand what the need for a numeric ID is. >>

Hopefully I accomplished that in my ramblings above. To put it plain - I
have a situation where the client initiating the async operations actually
knows nothing about the async operation itself. The client is simply calling
a method of an interface (ValidateAsync). That method is implemented in
another "worker bee class" that, itself, does the grunt work of launching
the async operation (via BeginInvoke etc). The ValidateAsync method returns
immediately (fire and forget). That "worker bee class" subsequently
communicates back to the client _only_ by raising events. The client then
only handles those events.

-Frankie
 
Frankie said:
[...]
<< And why can that identification not be accomplished via an instance of
some class that specifically refers to the operation in some way? >>

It _could_ be. But in my case I don't already have a class that refers to
the async operation.

If you have no class that refers to the async operation, then how would
you use a numeric ID to map to an async operation?

Surely the async operation has _some_ data somewhere. Otherwise, you
have no way to reference it.
In my case, the client is initiating the async
operation by calling the ValidateAsync method of an interface - so the
client doesn't know anything about the particular class implementing the
async operation

So what? I never said there's an existing class that the client knows
about. We are (as far as I know) talking about how the design _could_
be, not how it is.

So, just because there's no class the client knows about now, that
doesn't mean there couldn't be one. Just return the reference to the
class implementing the async operation.

It doesn't need to be the actual type known to the code that actually
uses it. Publish some sort of base class or interface that you can
return; all the client would know is "this is my unique reference to the
async operation". Then the actual reference would be a class that
inherits or implements the base class or interface, respectively.
[...]
ValidateAsync takes a
parameter that can be any object. That parameter is subsequently used in the
"event publisher/worker class" to identify the async operation.

Why is the worker class using data from the client to identify its own
data? What happens if the client uses the same value twice? Is there
any check on the parameter to ensure that it is in fact unique?
I was
thinking this parameter - originating in the client- could be a unique
integer, being that there really was no class or other obvious way for the
client to identify the particular async operation.

Re:
<< But that doesn't mean that the data isn't instantiated somewhere >>

I'd agree that some data is _likely_ instantiated somewhere (in client or in
the async operation itself), but I don't see that as a requirement of
anything (and I suspect you don't either), and in my case my client simply
doesn't have it.

I don't understand why you keep mentioning what the client doesn't have
(see each of the above two paragraphs). Are we not talking about a
component that you are in the process of designing? What does it matter
what the client has now? How does that restrict what you are able to do
in your design?

Just because you're not returning something useful to the client now, I
don't see why that means you cannot do so in a new version of your design.

It seems to me that one of the reasons my point wasn't getting across in
a previous message is that it seems that the unique ID you're looking
for isn't in fact used by the async processing, but rather only by the
client for managing some list of outstanding tasks. IMHO, that
overlooks the statement you made about wanting to be able to cancel the
async operation, but I can't figure out any other reason that what I've
written isn't clear enough.

So, let's take the two examples you've mentioned most recently of things
you might do with the unique ID:

1) Manage a client-side data structure (eg list) of outstanding tasks

2) Allow for canceling of outstanding tasks

Now, in the #1 scenario, I agree it doesn't matter what the value is as
long as it's unique. Though, if you haven't associated the value with
any actual data, it begs the question as to why use a value at all. As
you've already pointed out, you could just keep a counter.

In the #2 scenario, however...I find it obvious that there must be
_some_ data somewhere associated with that ID. If one assumes that the
client will pass that ID to the async implementation, and that the async
implementation will be able to use that ID to identify a task to be
canceled, then there _must_ be some mapping from the ID to some data
representing the task.

So, instead of having the client pass the ID and managing some sort of
dictionary mapping it to the data representing the task, why not just
pass a reference to that data back to the client? The client need not
know what's actually in it (see above regarding a simple base class or
interface); it just needs to hang on to it in case it wants to refer to
the specific task in the future.
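One way to read the suggestion above is sketched below, under the assumption that ValidateAsync may return something to the client. IOperationHandle, Validator and the Action-based ValidateAsync signature are all hypothetical; the only framework calls are ThreadPool.QueueUserWorkItem, Interlocked.Exchange and Volatile.Read.

```csharp
using System;
using System.Threading;

// Hypothetical opaque handle: all the client ever sees of the operation.
public interface IOperationHandle
{
    bool IsCancellationRequested { get; }
    void Cancel();
}

public sealed class Validator
{
    // The real per-operation data lives here, hidden from the client.
    private sealed class Operation : IOperationHandle
    {
        private int _cancelled;

        public bool IsCancellationRequested
        {
            get { return Volatile.Read(ref _cancelled) == 1; }
        }

        public void Cancel()
        {
            Interlocked.Exchange(ref _cancelled, 1);
        }
    }

    // Starts the work on a thread-pool thread and returns the handle.
    // The work delegate is expected to poll IsCancellationRequested.
    public IOperationHandle ValidateAsync(Action<IOperationHandle> work)
    {
        var op = new Operation();
        ThreadPool.QueueUserWorkItem(_ => work(op));
        return op;
    }
}
```

The client keeps the returned reference and calls Cancel on it directly, so the worker never has to map a client-invented id back to its own data.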

I'm afraid I have, at least for the moment, run out of different ways to
state the above. It all comes down to the fact that the ID is
apparently supposed to represent some "thing" and in .NET all "things"
are ultimately representable by references, so IMHO you might as well
use that reference rather than some arbitrary ID that maps to that
reference.

Note that the key here is that there seems to be a one-to-one
relationship between this ID and some "thing". There are of course
situations in which you need a unique numeric ID that maps to something
that either doesn't exist yet, or you need the ID to be constant across
multiple executions of your program, or any number of other situations
in which unique numeric ID's unrelated to the object references are
needed. But so far, nothing about what you've described suggests that
any of those possible situations apply here.

Pete
 
If, after reading my latest replies below, things aren't making sense, then
it might be useful for you to have a quick look at the following links. I'm
making this recommendation *not* as any dismissive step, but it provides a
perfectly clear example of the sort of implementation I'm modeling.

The first is the component that kicks off and maintains the async
operations, themselves
http://msdn2.microsoft.com/en-us/library/9hk12d4y.aspx

This is to the client of the above component:
http://msdn2.microsoft.com/en-us/library/8wy069k1.aspx


Please note that I'm not blindly following some MSDN example and taking it
as gospel. You can see that in the client they use Random to generate the
unique identifier for each async operation. My OP here questioned that
practice.

Furthermore, I believe I understand your points... I just don't see how they
apply to the Event-based async pattern - which is implemented in the sample
code at the above links - or even how your recommendations would improve on
the implementation going on in the sample code at the above links. I'm open
to learning about that improvement if it is to be had.

Unless I'm mistaken, the only reasonable way to get away from the client
generating the unique ID for the async operation - in the implementation at
the above links - would be for the CalculatePrimeAsync method (which
currently returns void) to return an IAsyncResult to the client. But that
would break the model being promoted here which hides IAsyncResult, etc and
other async operation implementation details from the client.

Now, for the inline responses:


Peter Duniho said:
Frankie said:
[...]
<< And why can that identification not be accomplished via an instance of
some class that specifically refers to the operation in some way? >>

It _could_ be. But in my case I don't already have a class that refers to
the async operation.

If you have no class that refers to the async operation, then how would
you use a numeric ID to map to an async operation?

The client doesn't need to map to the async operation, itself. The client
has a reference to another class that, in turn can kick off multiple
concurrent "copies" (for lack of a better term) of the async operation. In
fact, in the Event-based async pattern the client isn't supposed to maintain
the async operation, AFAICT. It's the "event publisher" in that pattern,
that maintains all the knowledge about the async pattern. The client needs
to be able to instruct the publisher about specific async operations,
though, thus the need for the client to have and even control the
identification of each async operation.

Surely the async operation has _some_ data somewhere. Otherwise, you have
no way to reference it.


So what? I never said there's an existing class that the client knows
about. We are (as far as I know) talking about how the design _could_ be,
not how it is.

Yes - of course. Being that I can design just about anything I want here,
I'd like to get something I won't later regret...

So, just because there's no class the client knows about now, that doesn't
mean there couldn't be one. Just return the reference to the class
implementing the async operation.

I'm sure that's sound advice for async models other than the Event-based, in
which the client isn't directly responsible for maintaining much, if any,
direct knowledge of async operations.

It doesn't need to be the actual type known to the code that actually uses
it. Publish some sort of base class or interface that you can return; all
the client would know is "this is my unique reference to the async
operation". Then the actual reference would be a class that inherits or
implements the base class or interface, respectively.
[...]
ValidateAsync takes a parameter that can be any object. That parameter
is subsequently used in the "event publisher/worker class" to identify
the async operation.

Why is the worker class using data from the client to identify its own
data? What happens if the client uses the same value twice? Is there any
check on the parameter to ensure that it is in fact unique?

Yes - there is a check in the worker class.

I don't understand why you keep mentioning what the client doesn't have
(see each of the above two paragraphs). Are we not talking about a
component that you are in the process of designing? What does it matter
what the client has now? How does that restrict what you are able to do
in your design?

It doesn't. I was just trying to convey what I was actually doing.
Nevertheless, the fact that I could do anything I want doesn't mean that I
should add more complexity to the Event-based async pattern.

Just because you're not returning something useful to the client now, I
don't see why that means you cannot do so in a new version of your design.

It seems to me that one of the reasons my point wasn't getting across in a
previous message is that it seems that the unique ID you're looking for
isn't in fact used by the async processing, but rather only by the client
for managing some list of outstanding tasks.

Exactly!

IMHO, that overlooks the statement you made about wanting to be able to
cancel the async operation,

No, it doesn't. The client instructs the worker class - here go ahead and
cancel async operation #3. The worker class then goes and cancels it.
There's nothing about that scenario that requires the client to have any
more knowledge about the async operation than some arbitrary unique
identifier.

but I can't figure out any other reason that what I've written isn't clear
enough.

So, let's take the two examples you've mentioned most recently of things
you might do with the unique ID:

1) Manage a client-side data structure (eg list) of outstanding tasks

2) Allow for canceling of outstanding tasks

Now, in the #1 scenario, I agree it doesn't matter what the value is as
long as it's unique. Though, if you haven't associated the value with any
actual data, it begs the question as to why use a value at all. As you've
already pointed out, you could just keep a counter.

In the #2 scenario, however...I find it obvious that there must be _some_
data somewhere associated with that ID.

Yes - that data _somewhere_ is in the worker class. But the client doesn't
know anything about it. It could, but that would be unnecessary, AFAICT.

If one assumes that the client will pass that ID to the async
implementation, and that the async implementation will be able to use that
ID to identify a task to be canceled, then there _must_ be some mapping
from the ID to some data representing the task.

Yes - and that mapping takes place in the worker class. Remember, the worker
class can have N number of concurrent copies of any particular async operation
going on simultaneously.

So even though the client has a reference to the worker class, that is not
enough for the client to go on to identify any particular copy of the N
potential async operations going on in the worker.

So, instead of having the client pass the ID and managing some sort of
dictionary mapping it to the data representing the task, why not just pass
a reference to that data back to the client? The client need not know
what's actually in it (see above regarding a simple base class or
interface); it just needs to hang on to it in case it wants to refer to
the specific task in the future.

I'm afraid I have, at least for the moment, run out of different ways to
state the above. It all comes down to the fact that the ID is apparently
supposed to represent some "thing" and in .NET all "things" are ultimately
representable by references, so IMHO you might as well use that reference
rather than some arbitrary ID that maps to that reference.

The problem with that is -- the reference that the client has to the worker
in no way helps the client to identify any particular async operation going
on in the worker. Keep in mind that by "worker" I'm meaning the class that
is responsible for initiating the async operations- possibly N copies of the
same operation, depending on how many are requested by the client. By
"worker" I'm not meaning the actual method that is executed asynchronously
on a background thread.

Note that the key here is that there seems to be a one-to-one relationship
between this ID and some "thing". There are of course situations in which
you need a unique numeric ID that maps to something that either doesn't
exist yet, or you need the ID to be constant across multiple executions of
your program, or any number of other situations in which unique numeric
ID's unrelated to the object references are needed. But so far, nothing
about what you've described suggests that any of those possible situations
apply here.

I'll try again, then:

Client ----> AsyncOpPublisher ----> AsyncOperation1
                              ----> AsyncOperation2
                              ----> AsyncOperation3
                              ----> AsyncOperationN

The client requests the publisher to initiate AsyncOperation
Publisher then gets AsyncOperation1 going via BeginInvoke()

I suppose that at this point, the publisher *could* return an IAsyncResult
to the client.

But that's not what's called for in the Event-based async pattern, which
calls for the AsyncOpPublisher to *hide* all of that IAsyncResult stuff from
client.

So, client doesn't simply tell AsyncOpPublisher to initiate an
AsyncOperation. In the request to initiate the op, the client also supplies
a unique identifier. The publisher then starts the async op (BeginInvokes
it), and then returns *nothing* to the client. The AsyncOpPublisher then
associates the client-supplied unique identifier with the async op for the
lifetime of the async op. As the async op proceeds, the AsyncOpPublisher
will raise events in order to communicate status, etc back to the client
about the progress or final results of async op. As those events are raised,
AsyncOpPublisher includes the client-supplied unique identifier so that the
client can know which async op is being reported on.
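
For illustration only, the flow described above can be sketched roughly as follows. The member names (StartAsync, OperationCompleted, RaiseCompleted) are hypothetical, a thread-pool work item stands in for BeginInvoke, and a plain Dictionary stands in for whatever lookup structure the real publisher uses:

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Threading;

// Hypothetical publisher: the client supplies the task ID, nothing is
// returned, and the ID comes back to the client in the completion event.
class AsyncOpPublisher
{
    private readonly object _sync = new object();
    private readonly Dictionary<object, AsyncOperation> _running =
        new Dictionary<object, AsyncOperation>();

    public event AsyncCompletedEventHandler OperationCompleted;

    public void StartAsync(object taskId)
    {
        AsyncOperation op;
        lock (_sync)
        {
            // Reject a task ID that is already in use.
            if (_running.ContainsKey(taskId))
                throw new ArgumentException("Task ID must be unique.", "taskId");
            op = AsyncOperationManager.CreateOperation(taskId);
            _running.Add(taskId, op);
        }

        ThreadPool.QueueUserWorkItem(delegate
        {
            // ... the real work would go here ...
            lock (_sync) { _running.Remove(taskId); }
            // Hand the client-supplied ID back as UserState.
            op.PostOperationCompleted(RaiseCompleted,
                new AsyncCompletedEventArgs(null, false, taskId));
        });
    }

    private void RaiseCompleted(object state)
    {
        AsyncCompletedEventHandler handler = OperationCompleted;
        if (handler != null) handler(this, (AsyncCompletedEventArgs)state);
    }
}
```

The client would subscribe to OperationCompleted and compare e.UserState against the IDs it handed out.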

-Frankie
 
Frankie said:
[...]
Unless I'm mistaken, the only reasonable way to get away from the client
generating the unique ID for the async operation - in the implementation at
the above links - would be for the CalculatePrimeAsync method (which
currently returns void) to return an IAsyncResult to the client. But that
would break the model being promoted here which hides IAsyncResult, etc and
other async operation implementation details from the client.

Okay...thanks for the links. I now understand better what design model
you're trying to follow.

First, I will point out that in the sample, it is very clear what the
answer is to my repeated question regarding how the numeric ID is used. It
is used to retrieve an AsyncOperation from a HybridDictionary instance.
So the direct implication with respect to my previous comments is that
it's the AsyncOperation that you should pass back to the client, somehow
(note that you need not pass the client something it recognizes as an
AsyncOperation, or even something from which it can easily get the
AsyncOperation...it just needs to be something that immediately can be
translated into an AsyncOperation).

If you want to write code that generates unique numeric IDs, fine.
IMHO, GUIDs are overkill and random numbers aren't going to work at all
(since they aren't guaranteed to be unique). But otherwise, you
certainly could do that. The sample code you posted uses GUIDs, but
sequential numbers would work just fine (just be prepared to catch the
ArgumentException for non-unique numbers and try the next one, in the
unlikely event that you wrap around the full range of the 32-bit or
64-bit numeric variable you're using).
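
A minimal sketch of the sequential-number idea, for illustration only. TaskRegistry and Register are hypothetical names, not from the MSDN sample, and a plain Dictionary stands in for the sample's HybridDictionary. Instead of catching the ArgumentException, this version simply skips any ID that is still in use, which should have the same effect after a wrap-around:

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical helper that hands out sequential task IDs.
class TaskRegistry
{
    private readonly object _sync = new object();
    private readonly Dictionary<long, AsyncOperation> _operations =
        new Dictionary<long, AsyncOperation>();
    private long _lastId;

    // Registers a new operation under the next free sequential ID.
    public long Register()
    {
        lock (_sync)
        {
            long id;
            do
            {
                // Wraps around silently instead of overflowing.
                id = unchecked(++_lastId);
            }
            while (_operations.ContainsKey(id));

            _operations.Add(id, AsyncOperationManager.CreateOperation(id));
            return id;
        }
    }

    // Looks up the operation for an ID; returns false if it is unknown.
    public bool TryGet(long id, out AsyncOperation operation)
    {
        lock (_sync) { return _operations.TryGetValue(id, out operation); }
    }
}
```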

But, a couple of points that I hope will make clear what I'm trying to say:

1) It is not true that "the only reasonable way to get away from
the client generating the unique ID for the async operation...would be
for the CalculatePrimeAsync method...to return an IAsyncResult". The
method can return anything you want it to. It need not be an
IAsyncResult, and in fact should not be unless you really want a
mixed-mode event-plus-IAsyncResult design.

Using the sample code you're referring to as a template, let's look at
what modification I would make that would meet the goals I've already
stated. Rather than having a void return value, I would have
CalculatePrimeAsync return an instance of a class that looks something
like this:

class PrimeAsyncOperation : IPrimeAsyncOperation
{
    AsyncOperation _ao;

    public AsyncOperation AsyncOperation
    {
        get { return _ao; }
    }

    public PrimeAsyncOperation(AsyncOperation ao)
    {
        _ao = ao;
    }
}

where:

    public interface IPrimeAsyncOperation
    {
    }

The PrimeAsyncOperation class itself need not be visible to the client;
only the interface IPrimeAsyncOperation needs to be, and that's what the
CalculatePrimeAsync method would return.

Or rather, it's nice to have it that way; you could actually just return
an Object and forget about the interface altogether, but having a type
allows you to ensure in your client code that you maintain objects of
the right type. The interface isn't really required here; I just feel
it makes the code a little nicer.

Anyway, then any time the client wants to refer to the async operation,
rather than passing an ID which then needs to be used to look up the
AsyncOperation, all that the worker class has to do is retrieve the
AsyncOperation from the class.
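
As a rough sketch of what that could look like on the worker side (PrimeWorker, CancelAsync and the CancelRequested flag are assumptions made for illustration; the actual background work is left out):

```csharp
using System;
using System.ComponentModel;

public interface IPrimeAsyncOperation
{
}

class PrimeAsyncOperation : IPrimeAsyncOperation
{
    private readonly AsyncOperation _ao;

    // Assumption: a flag the background work would poll to stop early.
    public volatile bool CancelRequested;

    public PrimeAsyncOperation(AsyncOperation ao)
    {
        _ao = ao;
    }

    public AsyncOperation AsyncOperation
    {
        get { return _ao; }
    }
}

// Hypothetical worker class: it creates the handle and hands it to the client.
class PrimeWorker
{
    public IPrimeAsyncOperation CalculatePrimeAsync(int numberToTest)
    {
        PrimeAsyncOperation handle = new PrimeAsyncOperation(
            AsyncOperationManager.CreateOperation(numberToTest));
        // ... start the background work for 'handle' here ...
        return handle;
    }

    public void CancelAsync(IPrimeAsyncOperation handle)
    {
        // No dictionary lookup: the handle already is the operation's data.
        PrimeAsyncOperation op = handle as PrimeAsyncOperation;
        if (op == null)
            throw new ArgumentException("Not a handle from this worker.", "handle");
        op.CancelRequested = true;
    }
}
```

The client only ever sees IPrimeAsyncOperation, so it cannot reach the AsyncOperation without knowing the concrete type.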

One final note: the above still has one level of indirection. It's MUCH
more efficient than messing around with a hash table a la
HybridDictionary, but it's still a level of indirection. It's required
if you want to pass back a typed interface because AsyncOperation is
sealed and you don't have a way of instantiating the AsyncOperation
class except through the factory AsyncOperationManager (either situation
would force the issue, and you have both here).

But, if you are okay with passing back a plain old Object, then this
extra level of indirection can just go away. You'd just pass the
AsyncOperation instance itself back, as an Object. The client wouldn't
know anything about it except that it would use it as the way of
uniquely identifying the operation.

Alternatively, in a situation where the relevant class isn't a sealed
class with a factory for instantiation, you could create a new class
that inherits the actual identifying class, and have that new class
implement the empty interface that is public to the client. Then you
can have a typed reference passed back to the client, but which still
doesn't expose the internal aspects of the async implementation.

Note also that there's no requirement that you use AsyncOperation to
manage your asynchronous tasks. You could easily use some other
mechanism that does allow for inheriting the state object yourself.

2) IMHO, there is nothing invalid about having an event-based
design that returns something from the async method. Just because that
one MSDN sample doesn't, that doesn't mean it's prohibited. You can
design your class however you feel will work best for you. Heck, even
if you did want to return an IAsyncResult, you could, though it's true
that implies a certain level of non-event-based-ness that may not be
desirable.

The fact is, IMHO the sample code you're referring to is not a very nice
implementation of an event-based async design. I personally don't like
the aspect that a single class is responsible for managing multiple
asynchronous tasks. I prefer instead a design similar to the
BackgroundWorker where you instantiate a single class for each instance
of an asynchronous operation. Then, the class itself is all you need to
reference the asynchronous operation.
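
A minimal sketch of that one-instance-per-operation style using BackgroundWorker, for illustration only. PrimeTestDemo and StartPrimeTest are hypothetical names and the trial-division test is just a stand-in for real work:

```csharp
using System;
using System.ComponentModel;

static class PrimeTestDemo
{
    // Starts one operation and returns the worker, which doubles as the
    // client's reference to that operation (no separate task ID needed).
    public static BackgroundWorker StartPrimeTest(int numberToTest)
    {
        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerSupportsCancellation = true;

        worker.DoWork += (sender, e) =>
        {
            BackgroundWorker self = (BackgroundWorker)sender;
            int n = (int)e.Argument;
            bool isPrime = n > 1;
            for (int d = 2; (long)d * d <= n; d++)
            {
                if (self.CancellationPending) { e.Cancel = true; return; }
                if (n % d == 0) { isPrime = false; break; }
            }
            e.Result = isPrime;
        };

        worker.RunWorkerCompleted += (sender, e) =>
        {
            // 'sender' is the worker itself, so it identifies the operation.
            if (e.Cancelled) Console.WriteLine("cancelled");
            else Console.WriteLine("is prime: {0}", e.Result);
        };

        worker.RunWorkerAsync(numberToTest);
        return worker;
    }
}
```

Cancelling a specific operation is then just a call to CancelAsync() on the worker that was returned for it.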

But, assuming you want a single class to manage multiple operations,
there's still no requirement that the client be the one to provide the
unique identifier, and IMHO it is much more natural and efficient to
allow the worker class managing the operations to provide the unique
identifier.

And I think that's all I'm going to say about that. :)

Pete
 
<snip all>

Thanks for the helpful dialog on this. I learned a lot as this was my
initial encounter with the Event-based async pattern and the MSDN sample is
what I was basing everything off of. And thanks especially for posting some
very clear feedback and perspective on how you would modify the MSDN sample
in a way that frees the client from having to come up with some arbitrary
ID. I didn't really like it to begin with, and especially their use of
Random... thus the OP here.

-Frankie
 
Frankie said:
<snip all>

Thanks for the helpful dialog on this. I learned a lot as this was my
initial encounter with the Event-based async pattern and the MSDN sample is
what I was basing everything off of. And thanks especially for posting some
very clear feedback and perspective on how you would modify the MSDN sample
in a way that frees the client from having to come up with some arbitrary
ID.

You're welcome. As you can see, it is often much easier to offer
comments when there is a concrete example of code to start with
(especially if that code sample is relatively minimal).

I didn't really like it to begin with, and especially their use of
Random... thus the OP here.

Indeed. I hope if nothing else it's clear that using Random isn't
appropriate for generating task IDs. Note, however, that the use of
Random in the sample code isn't for the task ID, but rather for the
number to test for prime-ness. The sample code uses a GUID as the task ID.

So, while there are obviously aspects of the sample I would do
differently, it's not actually broken. :)

Pete
 