PC Review


Reply
Thread Tools Rate Thread

Defining a block of consecutive rows in a datatable

 
 
prodata
Guest
Posts: n/a
 
      11th Apr 2010
I've got a large set of time-series data which is organised _strictly_
in time sequence and contained in a (VB2005) datatable.

I want to iterate through this data to extract various hourly means,
which means identifying blocks of consecutive rows and computing
hourly aggregate functions (mean, SD etc).

I do have this working but by progressively working through the table
from top to bottom, identifying a block of rows with the same hour
value, copying this block of rows to a separate scratchpad datatable
and using various datatable.compute aggregate functions on all of the
rows in the scratchpad table.

I'm sure this isn't the most efficient approach in that it would be
better to work on the full original datatable and to pass a filter
expression to the datatable.compute command to indicate which
particular block of row numbers should be used in the calculation, but
being only an occasional ADO.Net user I don't know how to build the
filter.

What I want to say for the filter argument is something like 'Where
Rownumber Between R1 And R2" or 'Rownumber >=R1 AND Rownumber <=R2".
(I will know which row numbers constitute the relevant block and I
don't want to filter by datetime value - each hourly block would be
little more than 0.01% of an annual datatable, so not efficient to
repeatedly test the whole datatable just to select a particular hour's
worth of data.)

So I guess the question is whether there is any way of accessing the
row number property from a filter expression. (I guess I could
manually build in a row number column into the table and filter on
that, but maybe this is unnecesssary.)

Anyone able to help please?
 
Reply With Quote
 
 
 
 
Cor Ligthert[MVP]
Guest
Posts: n/a
 
      11th Apr 2010
Hi gproData,

Where did you get the idea that loops take a lot of time?

If you see some short code, then it often done behind the scene as well in a
loop.

But you can of course use Linq to Dataset, but don't assume it goes quicker.

(At least it uses some more code)

http://msdn.microsoft.com/en-us/vbasic/bb688086.aspx

Success

Cor

"prodata" <(E-Mail Removed)> wrote in message
news:418fa776-d981-414d-b457-(E-Mail Removed)...
> I've got a large set of time-series data which is organised _strictly_
> in time sequence and contained in a (VB2005) datatable.
>
> I want to iterate through this data to extract various hourly means,
> which means identifying blocks of consecutive rows and computing
> hourly aggregate functions (mean, SD etc).
>
> I do have this working but by progressively working through the table
> from top to bottom, identifying a block of rows with the same hour
> value, copying this block of rows to a separate scratchpad datatable
> and using various datatable.compute aggregate functions on all of the
> rows in the scratchpad table.
>
> I'm sure this isn't the most efficient approach in that it would be
> better to work on the full original datatable and to pass a filter
> expression to the datatable.compute command to indicate which
> particular block of row numbers should be used in the calculation, but
> being only an occasional ADO.Net user I don't know how to build the
> filter.
>
> What I want to say for the filter argument is something like 'Where
> Rownumber Between R1 And R2" or 'Rownumber >=R1 AND Rownumber <=R2".
> (I will know which row numbers constitute the relevant block and I
> don't want to filter by datetime value - each hourly block would be
> little more than 0.01% of an annual datatable, so not efficient to
> repeatedly test the whole datatable just to select a particular hour's
> worth of data.)
>
> So I guess the question is whether there is any way of accessing the
> row number property from a filter expression. (I guess I could
> manually build in a row number column into the table and filter on
> that, but maybe this is unnecesssary.)
>
> Anyone able to help please?


 
Reply With Quote
 
Cor Ligthert[MVP]
Guest
Posts: n/a
 
      11th Apr 2010
By the way a block of consecutive rows in a datatable don't exist, in fact
they are ordered in the ways they are added without looking at the content.

But you can use a DataView to get a view on the rows in a consecutive way by
using the sort and rowfilter properties.

"prodata" <(E-Mail Removed)> wrote in message
news:418fa776-d981-414d-b457-(E-Mail Removed)...
> I've got a large set of time-series data which is organised _strictly_
> in time sequence and contained in a (VB2005) datatable.
>
> I want to iterate through this data to extract various hourly means,
> which means identifying blocks of consecutive rows and computing
> hourly aggregate functions (mean, SD etc).
>
> I do have this working but by progressively working through the table
> from top to bottom, identifying a block of rows with the same hour
> value, copying this block of rows to a separate scratchpad datatable
> and using various datatable.compute aggregate functions on all of the
> rows in the scratchpad table.
>
> I'm sure this isn't the most efficient approach in that it would be
> better to work on the full original datatable and to pass a filter
> expression to the datatable.compute command to indicate which
> particular block of row numbers should be used in the calculation, but
> being only an occasional ADO.Net user I don't know how to build the
> filter.
>
> What I want to say for the filter argument is something like 'Where
> Rownumber Between R1 And R2" or 'Rownumber >=R1 AND Rownumber <=R2".
> (I will know which row numbers constitute the relevant block and I
> don't want to filter by datetime value - each hourly block would be
> little more than 0.01% of an annual datatable, so not efficient to
> repeatedly test the whole datatable just to select a particular hour's
> worth of data.)
>
> So I guess the question is whether there is any way of accessing the
> row number property from a filter expression. (I guess I could
> manually build in a row number column into the table and filter on
> that, but maybe this is unnecesssary.)
>
> Anyone able to help please?


 
Reply With Quote
 
prodata
Guest
Posts: n/a
 
      11th Apr 2010
On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
wrote:
> By the way a block of consecutive rows in a datatable don't exist, in fact
> they are ordered in the ways they are added without looking at the content.


Yes I understand what you're saying. But in fact I control the order
that the rows are added in when creating the initial datatable so - at
least assuming that this order doesn't change during the lifetime of
the datatable - I can be confident that when I imagine that a given
block of rows is in strict datetime order then it really is.

But perhaps I might take from your comment that what I'm seeking to do
actually represents a relatively unusual and artificial situation and
so maybe is not well provided for by ADO.Net functions. Maybe I would
be better adding a column to the datatable that represents integer
hours since the very first entry in the datatable (which would be
calculated as each individual datarow was added) and then simply
performing the aggregate functions on each valid hour value in turn
(ie using the hour value as the filter in the .compute command).
 
Reply With Quote
 
Mr. Arnold
Guest
Posts: n/a
 
      11th Apr 2010
prodata wrote:
> On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
> wrote:
>> By the way a block of consecutive rows in a datatable don't exist, in fact
>> they are ordered in the ways they are added without looking at the content.

>
> Yes I understand what you're saying. But in fact I control the order
> that the rows are added in when creating the initial datatable so - at
> least assuming that this order doesn't change during the lifetime of
> the datatable - I can be confident that when I imagine that a given
> block of rows is in strict datetime order then it really is.
>
> But perhaps I might take from your comment that what I'm seeking to do
> actually represents a relatively unusual and artificial situation and
> so maybe is not well provided for by ADO.Net functions. Maybe I would
> be better adding a column to the datatable that represents integer
> hours since the very first entry in the datatable (which would be
> calculated as each individual datarow was added) and then simply
> performing the aggregate functions on each valid hour value in turn
> (ie using the hour value as the filter in the .compute command).



Datatable? Why bother? Why are you not using a List<T> of accessor
objects? Why are you not using Linq with a where clause to query the
List<T> of objects, along with using the Linq aggregate functions?
 
Reply With Quote
 
Cor Ligthert[MVP]
Guest
Posts: n/a
 
      11th Apr 2010
Why not, looks for me without your current need something you could do.

Seems to me vital for the data you are storing, as far as I understand it
from your reply.


"prodata" <(E-Mail Removed)> wrote in message
news:df95b179-7f4d-4e51-9bcb-(E-Mail Removed)...
> On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
> wrote:
>> By the way a block of consecutive rows in a datatable don't exist, in
>> fact
>> they are ordered in the ways they are added without looking at the
>> content.

>
> Yes I understand what you're saying. But in fact I control the order
> that the rows are added in when creating the initial datatable so - at
> least assuming that this order doesn't change during the lifetime of
> the datatable - I can be confident that when I imagine that a given
> block of rows is in strict datetime order then it really is.
>
> But perhaps I might take from your comment that what I'm seeking to do
> actually represents a relatively unusual and artificial situation and
> so maybe is not well provided for by ADO.Net functions. Maybe I would
> be better adding a column to the datatable that represents integer
> hours since the very first entry in the datatable (which would be
> calculated as each individual datarow was added) and then simply
> performing the aggregate functions on each valid hour value in turn
> (ie using the hour value as the filter in the .compute command).


 
Reply With Quote
 
Cor Ligthert[MVP]
Guest
Posts: n/a
 
      11th Apr 2010
Why do you create an extra collections when you have already a Ilist
implementing collection of rows.

Are you afraid that you don't use all the memory in a computer.

A datatable is a very good IListSource implementing Type



"Mr. Arnold" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> prodata wrote:
>> On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
>> wrote:
>>> By the way a block of consecutive rows in a datatable don't exist, in
>>> fact
>>> they are ordered in the ways they are added without looking at the
>>> content.

>>
>> Yes I understand what you're saying. But in fact I control the order
>> that the rows are added in when creating the initial datatable so - at
>> least assuming that this order doesn't change during the lifetime of
>> the datatable - I can be confident that when I imagine that a given
>> block of rows is in strict datetime order then it really is.
>>
>> But perhaps I might take from your comment that what I'm seeking to do
>> actually represents a relatively unusual and artificial situation and
>> so maybe is not well provided for by ADO.Net functions. Maybe I would
>> be better adding a column to the datatable that represents integer
>> hours since the very first entry in the datatable (which would be
>> calculated as each individual datarow was added) and then simply
>> performing the aggregate functions on each valid hour value in turn
>> (ie using the hour value as the filter in the .compute command).

>
>
> Datatable? Why bother? Why are you not using a List<T> of accessor
> objects? Why are you not using Linq with a where clause to query the
> List<T> of objects, along with using the Linq aggregate functions?


 
Reply With Quote
 
Cor Ligthert[MVP]
Guest
Posts: n/a
 
      11th Apr 2010
Why Not? Looks for me, without knowing all your current needs, something you
could do.

"Cor Ligthert[MVP]" <(E-Mail Removed)> wrote in message
news:O#(E-Mail Removed)...
> Why not, looks for me without your current need something you could do.
>
> Seems to me vital for the data you are storing, as far as I understand it
> from your reply.
>
>
> "prodata" <(E-Mail Removed)> wrote in message
> news:df95b179-7f4d-4e51-9bcb-(E-Mail Removed)...
>> On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
>> wrote:
>>> By the way a block of consecutive rows in a datatable don't exist, in
>>> fact
>>> they are ordered in the ways they are added without looking at the
>>> content.

>>
>> Yes I understand what you're saying. But in fact I control the order
>> that the rows are added in when creating the initial datatable so - at
>> least assuming that this order doesn't change during the lifetime of
>> the datatable - I can be confident that when I imagine that a given
>> block of rows is in strict datetime order then it really is.
>>
>> But perhaps I might take from your comment that what I'm seeking to do
>> actually represents a relatively unusual and artificial situation and
>> so maybe is not well provided for by ADO.Net functions. Maybe I would
>> be better adding a column to the datatable that represents integer
>> hours since the very first entry in the datatable (which would be
>> calculated as each individual datarow was added) and then simply
>> performing the aggregate functions on each valid hour value in turn
>> (ie using the hour value as the filter in the .compute command).

>

 
Reply With Quote
 
Mr. Arnold
Guest
Posts: n/a
 
      11th Apr 2010
Cor Ligthert[MVP] wrote:
> Why do you create an extra collections when you have already a Ilist
> implementing collection of rows.


I moved away from datatables and datasets the moment I went to ADO.NET
Entity Framework and Linq-2-Objects.
>
> Are you afraid that you don't use all the memory in a computer.


Accessing a datatable has been proven to be a slow means of accessing
data, much slower to access than a collections of objects when it comes
to querying.
>
> A datatable is a very good IListSource implementing Type


I prefer the List<T> of objects with its ability to be quired using
Linq-2-Objects. Objects are much more flexible. I'll take objects every
time over datatable and dataset.


>
>
>
> "Mr. Arnold" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> prodata wrote:
>>> On 11 Apr, 14:48, "Cor Ligthert[MVP]" <Notmyfirstn...@planet.nl>
>>> wrote:
>>>> By the way a block of consecutive rows in a datatable don't exist,
>>>> in fact
>>>> they are ordered in the ways they are added without looking at the
>>>> content.
>>>
>>> Yes I understand what you're saying. But in fact I control the order
>>> that the rows are added in when creating the initial datatable so - at
>>> least assuming that this order doesn't change during the lifetime of
>>> the datatable - I can be confident that when I imagine that a given
>>> block of rows is in strict datetime order then it really is.
>>>
>>> But perhaps I might take from your comment that what I'm seeking to do
>>> actually represents a relatively unusual and artificial situation and
>>> so maybe is not well provided for by ADO.Net functions. Maybe I would
>>> be better adding a column to the datatable that represents integer
>>> hours since the very first entry in the datatable (which would be
>>> calculated as each individual datarow was added) and then simply
>>> performing the aggregate functions on each valid hour value in turn
>>> (ie using the hour value as the filter in the .compute command).

>>
>>
>> Datatable? Why bother? Why are you not using a List<T> of accessor
>> objects? Why are you not using Linq with a where clause to query the
>> List<T> of objects, along with using the Linq aggregate functions?

>

 
Reply With Quote
 
 
 
Reply

Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Transposing data from consecutive rows into non-consecutive rows econ Microsoft Excel Misc 0 10th Mar 2008 07:24 PM
Inserting multiple rows in excel with data in consecutive rows =?Utf-8?B?dGVjaG5vdHJvbmlj?= Microsoft Excel Programming 2 20th Oct 2005 03:12 PM
Defining variables in a block? Microsoft VB .NET 11 29th Jul 2005 09:43 AM
How can I use real SQL on a DataTable? i.e. not array of rows using a filter... as in DataTable.Select Dan V. Microsoft C# .NET 3 1st Jul 2004 03:06 PM
defining a consecutive cells criteria daniel proville Microsoft Excel Worksheet Functions 1 24th Nov 2003 01:55 AM


Features
 

Advertising
 

Newsgroups
 


All times are GMT +1. The time now is 02:02 PM.