Grouping in Order

Willem

Hi there,

I have a table like this:

ID CLIENTNUM DATE TIME LATITUDE LONGITUDE BOUNDARYSTAT

1 1 06/07/04 03:03:23 52.22233 120.33442 IN
2 1 06/07/04 03:15:45 52.55666 120.44555 IN
3 1 06/07/04 03:23:23 52.88333 120.88383 OUT
4 1 06/07/04 03:43:22 52.44332 120.34343 IN
5 1 07/07/04 07:03:23 52.22233 120.33442 IN
6 1 07/07/04 07:15:45 52.55666 120.77483 OUT
7 1 07/07/04 07:23:23 52.88333 120.88383 OUT
8 1 07/07/04 07:43:22 52.44332 120.34343 IN


I want to group it so that it looks like this:

CLIENTNUM DATE FIRSTTIME LASTTIME BOUNDARYSTAT
1 06/07/04 03:03:23 03:15:45 IN
1 06/07/04 03:23:23 03:23:23 OUT
1 06/07/04 03:43:22 03:43:22 IN
1 07/07/04 07:03:23 07:03:23 IN
1 07/07/04 07:15:45 07:23:23 OUT
1 07/07/04 07:43:22 07:43:22 IN

I can't figure out how to do this. I group by [CLIENTNUM], then
[DATE], then [BOUNDARYSTAT] but the [BOUNDARYSTAT] groups for the
whole day. In other words my table looks like this:

CLIENTNUM DATE FIRSTTIME LASTTIME BOUNDARYSTAT
1 06/07/04 03:03:23 03:43:22 IN
1 06/07/04 03:23:23 03:23:23 OUT
1 07/07/04 07:03:23 07:43:22 IN
1 07/07/04 07:15:45 07:23:23 OUT



Thanks for any help
 
Willem

I reformatted the tables so that they fit (I hope)...

Hi there,

I have a table like this:

ID CLIENT DATE TIME LAT LONG BOUND

1 1 06/07 03:03 52.22 120.33 IN
2 1 06/07 03:15 52.55 120.44 IN
3 1 06/07 03:23 52.88 120.88 OUT
4 1 06/07 03:43 52.44 120.34 IN
5 1 07/07 07:03 52.22 120.33 IN
6 1 07/07 07:15 52.55 120.77 OUT
7 1 07/07 07:23 52.88 120.88 OUT
8 1 07/07 07:43 52.44 120.34 IN


I want to group it so that it looks like this:

CLIENT DATE FIRST LAST BOUND
1 06/07 03:03 03:15 IN
1 06/07 03:23 03:23 OUT
1 06/07 03:43 03:43 IN
1 07/07 07:03 07:03 IN
1 07/07 07:15 07:23 OUT
1 07/07 07:43 07:43 IN

I can't figure out how to do this. I group by [CLIENTNUM], then
[DATE], then [BOUNDARYSTAT] but the [BOUNDARYSTAT] groups for the
whole day. In other words my table looks like this:

CLIENT DATE FIRST LAST BOUND
1 06/07 03:03 03:43 IN
1 06/07 03:23 03:23 OUT
1 07/07 07:03 07:43 IN
1 07/07 07:15 07:23 OUT
Thanks for any help
 
Tom Ellison

Dear Willem:

Your columns for ID, Latitude, and Longitude do not figure into the
results, so I'm excluding them so I can better study the problem.

CLIENTNUM DATE TIME BOUNDARYSTAT
1 06/07/04 03:03:23 IN
1 06/07/04 03:15:45 IN
1 06/07/04 03:23:23 OUT
1 06/07/04 03:43:22 IN
1 07/07/04 07:03:23 IN
1 07/07/04 07:15:45 OUT
1 07/07/04 07:23:23 OUT
1 07/07/04 07:43:22 IN

From this, it appears you want to group by CLIENTNUM, DATE, and
BOUNDARYSTAT, showing the minimum and maximum times for those groups.
The query would be:

SELECT CLIENTNUM, [DATE], MIN([TIME]) AS FIRSTTIME,
MAX([TIME]) AS LASTTIME, BOUNDARYSTAT
FROM YourTable
GROUP BY CLIENTNUM, [DATE], BOUNDARYSTAT

I suspect you may have used FIRST() and LAST() instead of MIN() and
MAX(). For your purposes, the FIRST() and LAST() functions often give
virtually random results. That isn't what you would want, I expect.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts
 
Willem

Tom,

I have tried what you suggest but still have the same problem.
Basically the [BoundaryStat] grouping is taking the results from the
entire day and not from each change of state, so to speak.

In one day the movement for one client in and out of a boundary circle is:

from 03:03:23 to 03:15:45 the client was in the circle
from 03:23:23 to 03:23:23 the client was out of the circle
from 03:43:22 to 03:43:22 the client was back in the circle.

Using the query you suggested this is how the movement was reported:

from 03:03:23 to 03:43:22 the client was in the circle
from 03:23:23 to 03:23:23 the client was out of the circle
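
That merged report is exactly what a plain MIN/MAX grouping computes; a few lines of Python (an illustrative sketch only, not part of any Access solution) reproduce it from the 06/07 rows:

```python
from collections import defaultdict

# The 06/07 rows: (time, boundarystat)
rows = [
    ("03:03:23", "IN"),
    ("03:15:45", "IN"),
    ("03:23:23", "OUT"),
    ("03:43:22", "IN"),
]

# GROUP BY boundarystat with MIN/MAX of time -- the two separated
# IN runs (03:03-03:15 and 03:43) fall into ONE group.
groups = defaultdict(list)
for time, stat in rows:
    groups[stat].append(time)

report = {stat: (min(t), max(t)) for stat, t in groups.items()}
print(report["IN"])   # ('03:03:23', '03:43:22') -- spans the whole day
print(report["OUT"])  # ('03:23:23', '03:23:23')
```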

I hope this makes it a little clearer. Thank you for your help so
far.

Willem
 
Tom Ellison

Dear Willem:

Are you simply wanting to see every row of the data? That's really
too simple, so I'm going to take a guess at what I think maybe you
want. I think maybe what you want is to see only those rows where the
BOUNDARYSTAT changes from its previous value. Is that the case?

According to your more recent statement, the client was IN from
03:03:23 to 03:15:45 and OUT at 03:23:23. I guess you don't know
whether it is IN or OUT between 03:15:45 and 03:23:23 but it is OUT at
03:23:23, then undefined again from 03:23:23 to 03:43:22, when it was
momentarily IN, then undefined again after that.

It's difficult to visualize a physical system that relates to this
analysis of the information.

It's also peculiar that the client is not IN from 06/07/04 03:43:22 to
07/07/04 07:03:23, since these are consecutive datapoints, just on
different days.

I would like to work on a way to query this for you, but the way you
want it to work seems so unnatural to me that I'm concerned I would be
wasting my time without some insight into what is going on. Could you
enlighten me a bit here before I tackle this?

Also, are your dates represented DD/MM/YY? In the sample, are the two
dates one day apart, or a month apart?

I'm also thinking that having two separate columns for the date/time
may be a difficulty. I'll save that concern for later.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts


Pike

Hello,

I'm using S2k to illustrate a solution and I've
added some additional data. I'm being expedient
with the date(s) as that is a detail not pertinent
to the problem. It's only necessary that the data
be understandable:) I have also omitted the LAT
and LONG columns for the same reason.

create table #Willem([ID] int primary key,CLIENT int,
[DATE] char(5),[TIME] char(5),BOUND varchar(3))
go
insert #Willem values(1,1,'06/07','03:03','IN')
insert #Willem values(2,1,'06/07','03:15','IN')
insert #Willem values(3,1,'06/07','03:23','OUT')
insert #Willem values(4,1,'06/07','03:43','IN')
insert #Willem values(5,1,'07/07','07:03','IN')
insert #Willem values(6,1,'07/07','07:15','OUT')
insert #Willem values(7,1,'07/07','07:23','OUT')
insert #Willem values(8,1,'07/07','07:43','IN')
-- Additional data for client 2
insert #Willem values(9,2,'06/07','03:03','IN')
insert #Willem values(10,2,'06/07','03:15','IN')
insert #Willem values(11,2,'06/07','03:23','OUT')
insert #Willem values(12,2,'06/07','03:43','IN')
insert #Willem values(13,2,'07/07','06:23','IN')
insert #Willem values(14,2,'07/07','07:15','OUT')

This is a good example of why most procedural developers
think sql s..ks:)
With the data given there is no combination of columns
(as you have found out) that you can GROUP BY that will
take into account the sequence of BOUND within CLIENT and DATE.
Without taking the order of BOUND into account, it's therefore
not possible to use a GROUP BY query to obtain your result.
But if you could simply *glue* together the order of
BOUND within CLIENT and DATE then you could easily use a
GROUP BY query for the solution.

Here's the data with some *glue* (the Brank column) taking into
account the sequence of the BOUND column.
Call it table ##Willem (a S2k global temporary table):

select * from ##Willem order by [ID]

ID CLIENT DATE TIME BOUND *Brank*
----------- ----------- ----- ----- ----- -----------
1 1 06/07 03:03 IN 1
2 1 06/07 03:15 IN 1
3 1 06/07 03:23 OUT 2
4 1 06/07 03:43 IN 3
5 1 07/07 07:03 IN 4
6 1 07/07 07:15 OUT 5
7 1 07/07 07:23 OUT 5
8 1 07/07 07:43 IN 6
9 2 06/07 03:03 IN 1
10 2 06/07 03:15 IN 1
11 2 06/07 03:23 OUT 2
12 2 06/07 03:43 IN 3
13 2 07/07 06:23 IN 4
14 2 07/07 07:15 OUT 5

Brank is an integer counter or rank. Its actual values are not
important. What is important is that BOUND values within CLIENT and DATE
that are the same (ie. IN or OUT) have the same rank,
and when BOUND changes, Brank is changed (incremented by some
value, ie. 1). It is not even necessary that Brank be reset to
1 when the CLIENT changes. It could just as well increment ascending
over the entire table. You can think of Brank as a virtual grouping
column derived from the table.
Now, Brank can be used in a GROUP BY query to obtain the solution.
Obviously Brank is necessary for the solution (grouping) but not
part of it, ie. it is not needed as a returned column in the Select.

Select CLIENT,[DATE],Min([TIME]) as [FIRST],
Max([TIME]) as [LAST],BOUND
From ##Willem
Group By CLIENT,[DATE],Brank,BOUND
Order by CLIENT,Brank

Solution:

CLIENT DATE FIRST LAST BOUND
----------- ----- ----- ----- -----
1 06/07 03:03 03:15 IN
1 06/07 03:23 03:23 OUT
1 06/07 03:43 03:43 IN
1 07/07 07:03 07:03 IN
1 07/07 07:15 07:23 OUT
1 07/07 07:43 07:43 IN
2 06/07 03:03 03:15 IN
2 06/07 03:23 03:23 OUT
2 06/07 03:43 03:43 IN
2 07/07 06:23 06:23 IN
2 07/07 07:15 07:15 OUT
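
The two steps above -- derive the run counter, then group on it -- can be sketched procedurally. The following Python is purely illustrative (the thread's actual tools are S2k/RAC); it reproduces Brank and the client-1 portion of the solution:

```python
from itertools import groupby

# Client 1 sample rows in [ID] order: (client, date, time, bound)
rows = [
    (1, "06/07", "03:03", "IN"),
    (1, "06/07", "03:15", "IN"),
    (1, "06/07", "03:23", "OUT"),
    (1, "06/07", "03:43", "IN"),
    (1, "07/07", "07:03", "IN"),
    (1, "07/07", "07:15", "OUT"),
    (1, "07/07", "07:23", "OUT"),
    (1, "07/07", "07:43", "IN"),
]

# Step 1: the *glue* -- bump Brank whenever (client, date, bound) changes.
branked, prev_key, brank = [], None, 0
for client, date, time, bound in rows:
    key = (client, date, bound)
    if key != prev_key:
        brank += 1
        prev_key = key
    branked.append((client, date, time, bound, brank))

# Step 2: GROUP BY client, date, brank, bound with MIN/MAX of time.
solution = []
for _, run in groupby(branked, key=lambda r: r[4]):  # consecutive Brank
    run = list(run)
    times = [r[2] for r in run]
    solution.append((run[0][0], run[0][1], min(times), max(times), run[0][3]))
```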

Visualization of this type of solution is one thing,
*simply* generating the Brank column is another:)
Given either S2k or Access, an sql query to generate
Brank is not that simple. It would be much easier to do with
cursors/procedurally. With Yukon this type of rank should be simpler
to obtain using new sql features or the CLR. Unfortunately, I
don't think Access will have the new sql functions:(:)
Alternatively, Brank can be easily obtained on S2k using the RAC utility.

Here is the RAC execute statement to obtain table ##Willem:

Exec Rac @transform='_dummy_',
@rows='CLIENT & [DATE] & BOUND & [TIME] & [ID]',
@rowsort='[ID]',------or CLIENT & [DATE] & TIME & [ID]
@pvtcol='Report Mode',
@from='#Willem',
@grand_totals='n',@rowbreak='n',
-- The @rowindicators parameter constructs the rank column (Brank)
-- based on BOUND within CLIENT AND [DATE] (as specified in
-- the @rows parameter). The columns are sorted by [ID] assuming
-- [ID] reflects the true ordering of the data. If [ID] is arbitrary
-- then the table could be sorted by CLIENT,[DATE],[TIME] and [ID].
-- Brank is essentially a byproduct of the group/partition of
-- CLIENT,[DATE] and BOUND ordered from left to right, or by [ID] if
-- it is equivalent. This is how sql99 functions will work in Yukon.
-- They should make life easier by eliminating obsolete, complicated,
-- resource-intensive and non-scaling sql89/92 coding.
@rowindicators='BOUND{Brank}',@counterdatatype='int',
-- You can check out the RAC Help file for all the features of
-- its @select parameter.
@select='Select 1*[ID] as [ID],1*CLIENT as CLIENT,[DATE],
[TIME],BOUND,Brank into ##Willem
from rac
order by rd'

Notice no messy sql code necessary:)

Perhaps someone can give you an Access method of obtaining
Brank or, better yet, a simpler solution:)
I have posted ways of doing this using Access sql. If no one
jumps in perhaps I'll dig around. But be warned, it's not
pretty, it's inefficient, and it's nothing that will scale:)
It's examples like this that have many MS users waiting for Yukon.
But regardless, there's always RAC:) In fact the whole problem
could be solved within a single RAC execute by simply moving
the GROUP BY query to the RAC @select.

Have you given any thought to migrating your data to at least
S2k MSDE and simply linking your tables into Access?:)

HTH

RAC v2.2 and QALite @
www.rac4sql.net
 
Willem

Steve,

Holy cow - you understand. Thank you, thank you, thank you - for
just understanding! I was afraid that I was at a loss to explain it
clearly and you pegged me correctly as a procedural programmer and
thinker.

Yes I do find SQL to sort of s_ck! But I'm learning to like it a
little more, bit by bit. What I ended up doing prior to getting this
response from you was something fairly similar, but by way of VBA. It
is, however, very very slow.

I'm not familiar with Yukon or RAC - and basically brute learn my way
through manipulating A2K whenever I need to work with some databases.
You have provided me with some more information that will end up
saving me a lot of time.

I have considered moving to MSDE or MySQL as I expect my data to grow
quite drastically within the next year. My access database is now
sitting at around 200 mb and growing by 150mb a month. However, I
expect next year to be working in GB's.

I'm finding that A2K really slows down when I need to perform SQL
operations on larger tables. For instance, if I take one big table
and run a simple sort query on a calculated field expression it takes
about 1 hour. Yet, if I divide that same table into 10 parts and run
each part individually it only takes about 4 minutes per table. So
doing it in 10 parts takes 40 minutes, while 1 huge part takes an
hour or simply crashes my computer altogether.

As you can probably gather I'm new to intensive database design and
operations. One great thing with A2K is using VBA to run procedures
on my data, where SQL cannot perform in the same way.

With MSDE or MySQL can I have the same amount of procedural
programming control over the data like I can in A2K?

Thanks again,

Willem

 
Gary Walter

Hi Willem,

Here might be one Access way:

I saved Steve's data in a table "tW"

I could not help but think that [Date] and [Time]
strings should be a real DateTime, so I first set up
a query that at least computed a real DateTime.

qW:

SELECT tW.ID, tW.Client,
tW.Date As strDate,
tW.Time As strTime,
tW.BoundaryStat AS Bound,
CDate([Date])+TimeValue(CDate([Time])) AS DatTim
FROM tW;

This will probably fail depending on your Regional settings,
but my guess is that you have the skill to create a "DatTim"
from your strings.

You talked about slow queries, so I am going to assume
you have a lot of data, and use a table to store the data and then
update it (you could use this method for the query that
takes so long -- save the calculating query's results to a table w/o
sorting, then return data from this table w/ sorting).
I created a table "tblBRank"

ID Number
Client Number
strDate Text
strTime Text
Bound Text
PrevBound Text
DatTim Date/Time
BRank Number

Each time I run this process, I first
delete all records in tblBRank:

DELETE * FROM tblBRank;

Next, I found the previous Bound
and appended current data plus this
"PrevBound" to tblBRank.

qappPrevBound:

INSERT INTO tblBRank
( ID, Client, strDate, strTime, DatTim, Bound, PrevBound )
SELECT qW.ID, qW.Client, qW.strDate,
qW.strTime, qW.DatTim, qW.Bound,
(SELECT P.Bound FROM qW As P
WHERE P.Client=qW.Client
AND
P.DatTim =DMax("DatTim","qW","[DatTim]<#"
& qW.DatTim & "# AND [Client]=" & qW.Client)) AS PrevBound
FROM qW;

producing (odds of this not word-wrapping are very low):

ID Client strDate strTime DatTim Bound PrevBound BRank
1 1 06/07 03:03 6/7/2004 3:03:00 AM IN 0
2 1 06/07 03:15 6/7/2004 3:15:00 AM IN IN 0
3 1 06/07 03:23 6/7/2004 3:23:00 AM OUT IN 0
4 1 06/07 03:43 6/7/2004 3:43:00 AM IN OUT 0
5 1 07/07 07:03 7/7/2004 7:03:00 AM IN IN 0
6 1 07/07 07:15 7/7/2004 7:15:00 AM OUT IN 0
7 1 07/07 07:23 7/7/2004 7:23:00 AM OUT OUT 0
8 1 07/07 07:43 7/7/2004 7:43:00 AM IN OUT 0
10 2 06/07 03:03 6/7/2004 3:03:00 AM IN 0
11 2 06/07 03:15 6/7/2004 3:15:00 AM IN IN 0
12 2 06/07 03:23 6/7/2004 3:23:00 AM OUT IN 0
13 2 06/07 03:43 6/7/2004 3:43:00 AM IN OUT 0
14 2 07/07 06:23 7/7/2004 6:23:00 AM IN IN 0
15 2 07/07 07:15 7/7/2004 7:15:00 AM OUT IN 0

The subquery would not have to be so complicated if one
could assume that "ID" is "truly sequential in time," but I did
not know if that would always be true.
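
In procedural terms the correlated DMax subquery is just a previous-row lookup per client; a Python sketch of the same PrevBound derivation (illustrative only, with a few abbreviated sample rows) does it in one pass over rows sorted by client and DatTim:

```python
# (id, client, dattim, bound) -- three abbreviated sample rows, out of order
rows = [
    (3, 1, "2004-06-07 03:23", "OUT"),
    (1, 1, "2004-06-07 03:03", "IN"),
    (2, 1, "2004-06-07 03:15", "IN"),
]

prev_bound = {}   # id -> Bound of the previous row for that client
last = {}         # client -> Bound seen on the latest row so far
for rid, client, dattim, bound in sorted(rows, key=lambda r: (r[1], r[2])):
    prev_bound[rid] = last.get(client)   # None for a client's first row
    last[client] = bound
```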

I next saved the following function in a module:

Public Function fStaticCount(pBound As Variant, _
pPrevBound As Variant, _
Optional pReset As Boolean = False) As Long
On Error GoTo Err_fStaticCount
Static lngCnt As Long

If pReset = True Then
lngCnt = 0
Exit Function
End If

If pBound = pPrevBound Then
'same string
Else
lngCnt = Nz(lngCnt, 0) + 1
End If

fStaticCount = lngCnt

Exit_fStaticCount:
Exit Function

Err_fStaticCount:
MsgBox Err.Description
Resume Exit_fStaticCount
End Function
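
For readers who think procedurally: the VBA Static variable makes fStaticCount a counter whose state survives across calls. A rough Python analogue (a hypothetical illustration, not part of Gary's Access code) uses a closure for the same effect:

```python
def make_static_count():
    """Mimic VBA's 'Static lngCnt': state persists between calls."""
    state = {"count": 0}

    def f_static_count(bound, prev_bound, reset=False):
        if reset:
            state["count"] = 0
            return 0
        if bound != prev_bound:   # state change (or first row) -> new run
            state["count"] += 1
        return state["count"]

    return f_static_count

# One pass over client 1's 06/07 Bound values:
counter = make_static_count()
ranks, prev = [], None
for b in ["IN", "IN", "OUT", "IN"]:
    ranks.append(counter(b, prev))
    prev = b
```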

You don't have to, but before you run
the next step in the process, you could
reset the static count with the following
query:

qryReset:

SELECT TOP 1 fStaticCount("A","B",-1) AS Expr1
FROM tW;

Next in the process, we run an update query on tblBRank
to compute BRank:

qupdBRank:

UPDATE tblBRank
SET tblBRank.BRank = fStaticCount([Bound],[PrevBound]);

Using Steve's data, your tblBRank would then be:

ID Client strDate strTime DatTim Bound PrevBound BRank
1 1 06/07 03:03 6/7/2004 3:03:00 AM IN 1
2 1 06/07 03:15 6/7/2004 3:15:00 AM IN IN 1
3 1 06/07 03:23 6/7/2004 3:23:00 AM OUT IN 2
4 1 06/07 03:43 6/7/2004 3:43:00 AM IN OUT 3
5 1 07/07 07:03 7/7/2004 7:03:00 AM IN IN 3
6 1 07/07 07:15 7/7/2004 7:15:00 AM OUT IN 4
7 1 07/07 07:23 7/7/2004 7:23:00 AM OUT OUT 4
8 1 07/07 07:43 7/7/2004 7:43:00 AM IN OUT 5
10 2 06/07 03:03 6/7/2004 3:03:00 AM IN 6
11 2 06/07 03:15 6/7/2004 3:15:00 AM IN IN 6
12 2 06/07 03:23 6/7/2004 3:23:00 AM OUT IN 7
13 2 06/07 03:43 6/7/2004 3:43:00 AM IN OUT 8
14 2 07/07 06:23 7/7/2004 6:23:00 AM IN IN 8
15 2 07/07 07:15 7/7/2004 7:15:00 AM OUT IN 9

Now you can run your GroupBy query on tblBRank:

qryMovement:

SELECT
tblBRank.Client,
tblBRank.strDate,
Min(tblBRank.strTime) AS FirstTime,
Max(tblBRank.strTime) AS LastTime,
tblBRank.Bound
FROM tblBRank
GROUP BY
tblBRank.Client,
tblBRank.strDate,
tblBRank.Bound,
tblBRank.BRank
ORDER BY
tblBRank.Client,
tblBRank.strDate,
Min(tblBRank.strTime);

producing:

Client strDate FirstTime LastTime Bound
1 06/07 03:03 03:15 IN
1 06/07 03:23 03:23 OUT
1 06/07 03:43 03:43 IN
1 07/07 07:03 07:03 IN
1 07/07 07:15 07:23 OUT
1 07/07 07:43 07:43 IN
2 06/07 03:03 03:15 IN
2 06/07 03:23 03:23 OUT
2 06/07 03:43 03:43 IN
2 07/07 06:23 06:23 IN
2 07/07 07:15 07:15 OUT


So...the "process" would be:

'clear tblBRank
CurrentDb.Execute "DELETE * FROM tblBRank;",dbFailOnError
'append data plus prevbound to tblBRank
CurrentDb.Execute "qappPrevBound", dbFailOnError
'reset static count (call the function directly; Execute can't run a SELECT)
fStaticCount Null, Null, True
'update tblBRank w/ BRank
CurrentDb.Execute "qupdBRank", dbFailOnError
'now use qryMovement however you needed it

That might be one way using Steve's BRank.

Good luck,

Gary Walter
 
Gary Walter

If it is the case that you *can* depend
on ID to "be sequential," then your query
might be simplified to (untested):

qappPrevBound:

INSERT INTO tblBRank
( ID, Client, strDate, strTime, DatTim, Bound, PrevBound )
SELECT qW.ID, qW.Client, qW.strDate,
qW.strTime, qW.DatTim, qW.Bound,
(SELECT P.Bound FROM qW As P
WHERE P.Client=qW.Client
AND
P.ID =DMax("ID","qW","[ID]<" & qW.ID
& " AND [Client]=" & qW.Client)) AS PrevBound
FROM qW;
 
Pike

Gary Walter said:
Hi Willem,
[snipe]
qappPrevBound:

INSERT INTO tblBRank
( ID, Client, strDate, strTime, DatTim, Bound, PrevBound )
SELECT qW.ID, qW.Client, qW.strDate,
qW.strTime, qW.DatTim, qW.Bound,
(SELECT P.Bound FROM qW As P
WHERE P.Client=qW.Client
AND
P.DatTim =DMax("DatTim","qW","[DatTim]<#"
& qW.DatTim & "# AND [Client]=" & qW.Client)) AS PrevBound
FROM qW;
[snipe]

Kudos on your intrinsic Access solution.

As I alluded to in my post, the good news about sql is
that you can construct a solution. The bad news
is that a working solution is beyond the limits of the query
optimizer. This is the built-in clandestine trap door that is so
easy for developers to fall into. Sql traps exist in many types
of problems but they especially stand out in ranking problems.
Using sql where you have a correlated subquery with an
inequality operator can lead to oodles of comparisons. There
is no efficient way for the optimizer to do this regardless of indexing.
This type of query/subquery is the only way to attack the problem
in this context. Your DMax expression is an example of this.
Sql99 alleviates the inefficiency not by improving on the
optimization of such queries but by bypassing them entirely.
Sql99 takes a more *procedural* approach which will probably
both help and at the same time confuse developers, I fear.
But that is another subject:)

For some background on this topic you can check out:
http://www.winnetmag.com/SQLServer/Articles/ArticleID/42302/pg/1/1.html
http://www.winnetmag.com/SQLServer/Articles/ArticleID/42646/pg/1/1.html

RAC v2.2 and QALite @
www.rac4sql.net
 
Gary Walter

Is [snipe] a reserved word? :cool:

Willem,

Switch to MSDE/SQL Server and use RAC
if at all possible!

I wasted a bunch of time this weekend thinking
about your problem, and when Steve posted about
Brank, I set out to see it through. Maybe I should
not have posted it, maybe there was a better way....
obviously, there is a *better* way (see above).

Steve could probably whip out a "magic crosstab"
but he probably has thought it out and knows it too
will not be able to be optimized (I assume).

"Pike",

Thank you for the links.

I will check them out.

Gary Walter

"When I despair, I remember that all through
history the way of truth and love has always won.
There have been (Bushes and Cheneys and Murdochs)
and for a time they seem invincible
but in the end, they always fall....always."

-- Mahatma Gary
 
Pike

Willem said:
With MSDE or MySQL can I have the same amount of procedural
programming control over the data like I can in A2K?

Hi,

This is a huge subject and goes to the essence of what's
happening in the db world. MS describes it as *leverage*.
You just have to separate the hype from the reality. No matter
what, you should continue learning how to design normalized
db's and working on your sql skills. MS's plan is to leverage your
procedural skills in their world of .net and Yukon. There you will
be doing basically the same thing you're doing now in Access with
vba: retrieving some data via some form of sql statement, then
processing it procedurally. But with .net and Yukon your processing
will take place within the server as opposed to outside it. This
they hope will be more efficient. Note they are not giving you
the ability to get rid of sql, but to get rid of t-sql constructs (looping,
etc) and use the .net language for that (just like you're doing now
with vba). Will MS's new vision really work any better than Access/vba
linked with MSDE tables? Who knows:) I think they're about a year
away from RTM and half that from more stable, more fully
featured betas.
Of course MS isn't the only game going on. Open source stuff
(MySql/Postgresql) will be players. MySql shows really interesting
potential. They are also at least a year away from really showing off
all the improvements they've been working on, IMO. But even now
MySql is a viable alternative to Jet. IBM's DB2 could actually beat
MS to the punch with .net server side processing. And I haven't
even touched on the 'sql' feature sets in MySql, Oracle, etc. By feature
set I don't only mean standard sql but specific functionality to
solve specific problems (like the Access crosstab).
Like I said, it's a huge subject. But do keep in mind that
all vendors have developers like you in mind.
Post back if you want to go into more details:)
 
Gary Walter

Thank you again Steve for the good links.

I finally had a chance to read them.

Itzik Ben-Gan does another good job
of distilling and explaining.

Does the term "BRank" have any "significance"
(like "DRank" for Dense_Rank)?

(snip)
 
Pike

Gary Walter said:
Thank you again Steve for the good links.
If you liked that you'll love reading about this stuff
in Oracle/DB2. Go to the Oracle site and load their sql
documentation. Search for dense_rank(). Post back
if you need links.
Does the term "BRank" have any "significance"
(like "DRank" for Dense_Rank)?
:)
@rowindicators in RAC simulates sql99 dense_rank().
What's in the { } is the user-supplied column name. I simply
used 'Brank'.

steve
 
