Smartin said:
As an alternative to Allen's suggestion, and if you aren't concerned
about returning a record for every hour if no one was present, you can
try the method given by Douglas Steele in his reply to my post in this
thread:
http://preview.tinyurl.com/yaveee
If "not returning a record for every hour" were a requirement and the
OP had a single timestamp column, no CROSS JOIN (formerly cartesian
product) is required (as your link implies); rather, you just have to
GROUP BY hour and, unlike in your case (where you had to find values
within an hour of _each other_), we need to find the start of the
_current hour block_ e.g. consider this quick example:
CREATE TABLE Test (
date_col DATETIME NOT NULL
);
INSERT INTO Test (date_col) VALUES (#2007-01-01 09:01:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-01 10:01:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-01 10:02:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-01 11:01:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-01 11:02:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-01 11:03:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-02 09:01:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-02 10:01:00#);
INSERT INTO Test (date_col) VALUES (#2007-01-02 10:02:00#);
SELECT DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#, date_col),
#1990-01-01 00:00:00#), COUNT(*)
FROM test
GROUP BY DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#, date_col),
#1990-01-01 00:00:00#);
However, unlike you, the OP doesn't have single timestamps; rather, he
has periods i.e. start- and end-date pairs. First, the predicates for
sequenced queries are more complex e.g. this to find the intersection
of other periods for each start date:
CREATE TABLE Test2 (
start_date DATETIME NOT NULL,
end_date DATETIME NOT NULL,
CHECK(start_date < end_date)
)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 09:01:00#, #2007-01-01 09:02:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 10:01:00#, #2007-01-01 10:02:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 10:02:00#, #2007-01-01 11:03:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 11:01:00#, #2007-01-01 11:02:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 11:02:00#, #2007-01-01 12:03:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-01 11:03:00#, #2007-01-01 13:04:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-02 09:01:00#, #2007-01-02 09:02:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-02 10:01:00#, #2007-01-02 10:02:00#)
;
INSERT INTO Test2 (start_date, end_date)
VALUES (#2007-01-02 10:02:00#, #2007-01-02 11:03:00#)
;
SELECT DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T1.start_date), #1990-01-01 00:00:00#), COUNT(*)
FROM Test2 AS T1, Test2 AS T2
WHERE IIF(DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T1.start_date), #1990-01-01 00:00:00#) > DATEADD('H', DATEDIFF('H',
#1990-01-01 00:00:00#, T2.start_date), #1990-01-01 00:00:00#),
DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#, T1.start_date),
#1990-01-01 00:00:00#), DATEADD('H', DATEDIFF('H', #1990-01-01
00:00:00#, T2.start_date), #1990-01-01 00:00:00#)) < IIF(DATEADD('H',
DATEDIFF('H', #1990-01-01 00:00:00#, T1.start_date), #1990-01-01
00:59:59#) > DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T2.end_date), #1990-01-01 00:59:59#), DATEADD('H', DATEDIFF('H',
#1990-01-01 00:00:00#, T2.end_date), #1990-01-01 00:59:59#),
DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#, T1.start_date),
#1990-01-01 00:59:59#))
GROUP BY DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T1.start_date), #1990-01-01 00:00:00#)
Not only are the above predicates more complex, it returns only part of
the result: the OP requires a count for *every* hour period between
start_date and end_date, not just the first hour. In procedural terms
we need to iterate hourly between start_date and end_date. SQL is not a
procedural language; rather it is set based. Therefore, the preferred
solution is to use a set of hourly periods to which we can JOIN, hence
the Calendar table:
CREATE TABLE Sequence (
seq INTEGER NOT NULL UNIQUE
)
;
INSERT INTO Sequence (seq) VALUES (1)
;
INSERT INTO [Sequence] (seq)
SELECT Units.nbr + Tens.nbr
FROM
(
SELECT nbr FROM
(
SELECT 0 AS nbr FROM [Sequence]
UNION ALL
SELECT 1 FROM [Sequence]
UNION ALL
SELECT 2 FROM [Sequence]
UNION ALL
SELECT 3 FROM [Sequence]
UNION ALL
SELECT 4 FROM [Sequence]
UNION ALL
SELECT 5 FROM [Sequence]
UNION ALL
SELECT 6 FROM [Sequence]
UNION ALL
SELECT 7 FROM [Sequence]
UNION ALL
SELECT 8 FROM [Sequence]
UNION ALL
SELECT 9 FROM [Sequence]
) AS Digits
) AS Units,
(
SELECT nbr * 10 AS nbr FROM
(
SELECT 0 AS nbr FROM [Sequence]
UNION ALL
SELECT 1 FROM [Sequence]
UNION ALL
SELECT 2 FROM [Sequence]
UNION ALL
SELECT 3 FROM [Sequence]
UNION ALL
SELECT 4 FROM [Sequence]
UNION ALL
SELECT 5 FROM [Sequence]
UNION ALL
SELECT 6 FROM [Sequence]
UNION ALL
SELECT 7 FROM [Sequence]
UNION ALL
SELECT 8 FROM [Sequence]
UNION ALL
SELECT 9 FROM [Sequence]
) AS Digits
) AS Tens
WHERE Units.nbr + Tens.nbr > 1
;
CREATE TABLE Calendar (
start_date DATETIME NOT NULL,
end_date DATETIME NOT NULL
)
;
INSERT INTO Calendar (start_date, end_date) VALUES
(#2007-01-01 00:00:00#, #2007-01-01 00:59:59#)
;
INSERT INTO Calendar (start_date, end_date)
SELECT DATEADD('H', S1.seq, C1.start_date),
DATEADD('H', S1.seq, C1.end_date)
FROM Calendar AS C1, Sequence AS S1
;
After the one-off effort to create the Calendar table, sequenced
queries are easier to write e.g. the OP's required results without
showing the period with a zero count:
SELECT C1.start_date, COUNT(*)
FROM Test2 AS T1, Calendar AS C1
WHERE IIF(DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T1.start_date), #1990-01-01 00:00:00#) > C1.start_date, DATEADD('H',
DATEDIFF('H', #1990-01-01 00:00:00#, T1.start_date), #1990-01-01
00:00:00#), C1.start_date) < IIF(DATEADD('H', DATEDIFF('H', #1990-01-01
00:00:00#, T1.end_date), #1990-01-01 00:59:59#) > C1.end_date,
C1.end_date, DATEADD('H', DATEDIFF('H', #1990-01-01 00:00:00#,
T1.end_date), #1990-01-01 00:59:59#))
GROUP BY C1.start_date;
Jamie.