HSalim said:
James.
As requested, I have emailed you a rar archive containing a compressed
file.
I changed the file extension to .md_ , so remember to change it back to
.mdb after extracting the file.
Regards
HS
HSalim[MVP] wrote:
This is an interesting query and a fellow MVP brought it to my attention
as a fun challenge.
I've created a sample database with about 850 records that would
represent the data below. This will help test various scenarios. I've
consolidated the date and time portions of Start and End dates into one
field each, partly to simplify the query , partly because in my reading
of Michanker's post of 3/15 9.53 AM it seems that the data may indeed
be stored that way.
I also created a small procedure that would analyze the data as I had
mentioned in my earlier post, but I have not spent much time on trying
to do this in just one or more queries.
I suspect that it may not be possible to do this using just access
queries, but that is what got me interested in this thread.
Working with one set of sample records might be helpful to exchange
ideas and I'm happy to share the one I created.
I do not have a public site to exchange files, so if you are interested,
send me an email and I'll send it out
Regards
HS
I'd love to have a nice data file to test this out, especially as a gauge
of its efficiency. Please send a copy to: MPAPoster at FortuneJames dot
com
I'll also try sending you an email.
Here are a few thoughts I had since this morning:
I signed with (e-mail address removed) instead of
(e-mail address removed)
This was because I was linking through a university account and had to
use Google to post. Normally, I use Google to post to
Comp.Databases.MS-Access and Mozilla to post to microsoft.public.access
and got confused by the context.
Because of the change to qryTimeNodes it may be possible to get a pair of
records with the same TimeNode value but different StartTimeNode values.
If that occurs it doesn't matter since the Duration will be 0 and no
contribution will be made to Coverage either way.
Using [Start Date] + [Start Time], although relatively safe, is still
somewhat implemention dependent. Perhaps I should use built-in functions
when adding a time to a date. Anyway, I'm glad you've consolidated them.
Thanks,
James A. Fortune
(e-mail address removed)
Some interesting homemade videos (none by me):
http://www.youtube.com/browse?s=mf
The test file (tblSchedule) contained 850 records with an ID that I named
RecordID and start and end times called Start and End covering a period of
just over three months starting on 12/29/04 and ending on 3/1/05. The
first thing I noticed when importing the file was that an interval usually
covered the transition from one month to the next. Any time the month
boundary is spanned, a TimeNode containing the beginning of the month
needs to be included. I also made RecordID a primary key and indexed the
other fields. Without those auxiliary month start values the translation
query is something like:
SELECT RecordID, Start, End, Year([Start]) & Format(Month([Start]),'00')
AS theMonth INTO MyTimeTable FROM tblSchedule;
I'm guessing that the Coverage for January and February is the test. I
tried my earliest query:
SELECT RecordID, theMonth, Start AS TimeNode FROM MyTimeTable UNION SELECT
RecordID, theMonth, End FROM MyTimeTable AS TimeNode UNION SELECT 1 AS
RecordID, theMonth, DateSerial(Left([theMonth], 4), Right([theMonth], 2),
1) FROM MyTimeTable AS TimeNode ORDER BY theMonth, TimeNode;
1703 records came up.
Next I tried the query I posted:
qryTimeNodesPosted:
SELECT theMonth, Start AS TimeNode, -1 As StartTimeNode FROM MyTimeTable
UNION SELECT theMonth, End AS TimeNode, 1 As StartTimeNode FROM
MyTimeTable ORDER BY theMonth, TimeNode;
Without the auxiliary values for the month start it produced 1518 values.
Any end times past the end of the month can use the end of the month
instead when calculating durations. IIf might solve that but it's another
potential slow down.
Note that the running sum of StartTimeNode values should never be greater
than 0.
To get that to happen correctly (qryTimeNodesPosted lost duplicate
TimeNodes) I adjusted qryTimeNodesPosted to include the RecordID so that
the DISTINCT effect of the Union Query would not toss them away:
qryTimeNodes:
SELECT RecordID, theMonth, Start AS TimeNode, -1 As StartTimeNode FROM
MyTimeTable UNION SELECT RecordID, theMonth, End AS TimeNode, 1 As
StartTimeNode FROM MyTimeTable ORDER BY theMonth, TimeNode;
I tried just a running sum and convinced myself that, although not fully
optimized, I have the means to get a solution within a reasonable amount
of time.
qryRunningSum:
SELECT theMonth, TimeNode, (SELECT SUM(A.StartTimeNode) FROM qryTimeNodes
AS A WHERE A.TimeNode <= qryTimeNodesPosted.TimeNode AND A.theMonth =
qryTimeNodesPosted.theMonth) AS RunningSumStartTimeNode FROM
qryTimeNodesPosted;
The first few records of qryRunningSum look like:
theMonth TimeNode RunningSumStartTimeNode
200412 12/29/04 1:30:00 AM -1
200412 12/29/04 2:15:01 AM -2
200412 12/29/04 3:15:01 AM -3
200412 12/29/14 4:15:01 AM -4
...
200412 1/2/05 4:30:18 PM -2
200412 1/2/05 4:30:18 PM -2
200412 1/2/05 4:45:18 PM -1
200412 1/2/05 11:45:20 PM 0
200501 1/1/05 1:30:00 AM -1
...
Using a running sum on StartTimeNode isn't very efficient. I'm starting
to understand the problem better now. One thing that I noticed from the
sample data is that since most of the intervals are multiply covered, the
number of times where the running sum is 0 should be small. When the
month is fully covered, the running sum is 0 only at the end. When the
count of the TimeNode values contributed by Start and End up to the
current value match up is where an uncovered duration can start. Maybe I
can get that information earlier. I can start to look for more efficient
ways to go about it rather than looking at all possible time durations. I
must say I was surprised at how quickly John Viescas' query ran on the
sample data. I think I can learn from John's example even after I've
optimized everything I can. This is a great problem.
James A. Fortune
(e-mail address removed)