Parallel.For to retrieve DataTables

SetonSoftware

I'm pulling a series of DataTables to feed to a report. Each of the
stored procedures takes a few seconds to run, so to speed up response
time I'm trying to use VS2010's parallel programming features.
Here's the code I have so far:

   List<DataTable> crDataList = new List<DataTable>();

   Parallel.For(0, 2, i =>
   {
     GetData(crDataList, i);
   });


   private void GetData(List<DataTable> crDataList, int i)
   {
     crDataList.Add(GetDT(i));
   }

GetDT() executes a different stored proc depending on the value of
i passed in. I want to fill the generic List crDataList with the
DataTables returned from the different stored procs, but I'm only
getting the first one. What am I missing here?

Thanks

Carl
 
SetonSoftware

Impossible to say without a concise-but-complete code example that
reliably reproduces the problem.  But, since you have done nothing to
synchronize access to the List<DataTable> object, certainly one possible
outcome is for only one of the adds to the list to actually be retained.
  You're lucky that the data structure doesn't wind up corrupted and
unusable instead of simply producing deficient results.

There are a variety of ways you can address the synchronization issue.
One simple change would be to just synchronize the access to the list:

   void Method()
   {
     List<DataTable> crDataList = new List<DataTable>();
     object objLock = new object();

     Parallel.For(0, 2, i =>
     {
       DataTable table = GetDT(i);

       lock (objLock)
       {
         crDataList.Add(table);
       }
     });
   }

A slightly more complicated, but more flexible (and efficient), approach
would be to use the Parallel.For() overload that allows for per-thread
loop initialization and conclusion:

   void Method()
   {
     List<DataTable> crDataList = new List<DataTable>();
     object objLock = new object();

     Parallel.For(0, 2, () => new List<DataTable>(),
     (i, state, local) =>
     {
       local.Add(GetDT(i));
       return local;
     },
     local =>
     {
       lock (objLock)
       {
         crDataList.AddRange(local);
       }
     });
   }

That version allows each thread to operate independently on its own
local copy of a list, synchronizing only at the end to consolidate the
lists.  This minimizes contention, which for a process involving a lot
more iterations and threads would be important (for the example you
posted, not so much).

There are other approaches one might take, including not using the
System.Threading.Tasks namespace at all (i.e. using some of the other
threading features in .NET), or using an instance of the Task class to
represent each iteration of the loop.  But I think either of the two
above examples should work fine.
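
For what it's worth, here is a rough sketch of the Task-per-iteration idea mentioned above. It is only an illustration, not tested against a real database: the GetDT() shown is a hypothetical stand-in that returns an empty named table, and the class and method names (TaskPerQueryExample, LoadTables) are made up for the example. Each query gets its own Task<DataTable>, and the results are read back in index order, so no lock is needed and the list order matches the value of i.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

static class TaskPerQueryExample
{
    // Hypothetical stand-in for the GetDT(i) from the question; the real
    // one would run a different stored procedure for each value of i.
    static DataTable GetDT(int i)
    {
        return new DataTable("Table" + i);
    }

    public static List<DataTable> LoadTables(int count)
    {
        var tasks = new Task<DataTable>[count];
        for (int i = 0; i < count; i++)
        {
            int index = i; // copy so the lambda does not capture the loop variable
            tasks[i] = Task.Factory.StartNew(() => GetDT(index));
        }

        // Blocks until every task has finished; if any task threw, the
        // failures are rethrown here wrapped in an AggregateException.
        Task.WaitAll(tasks);

        // Only this thread touches the list, so no synchronization is needed.
        var result = new List<DataTable>(count);
        foreach (Task<DataTable> task in tasks)
        {
            result.Add(task.Result);
        }
        return result;
    }

    static void Main()
    {
        List<DataTable> tables = LoadTables(2);
        Console.WriteLine(tables.Count);
    }
}
```

Compared with the Parallel.For() versions, this keeps the tables in a predictable order, at the cost of creating one task per query, which is fine for a handful of stored procs.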

Of course, all of this assumes that your database server is sufficiently
parallelized and has the bandwidth to deal with as many simultaneous
queries as your actual code will eventually make.  Otherwise, you'll
just wind up bottlenecked somewhere else and the client-side concurrent
code is a waste.  :)

Pete

This is fantastic. Thanks. I'm working on a dual-core machine and the
SQL Server is multi-processor, so we're now seeing the results returned
in 1/3 the time. I do love parallel processing.

Thanks

Carl
 
