S
Smartin
My task is to take a set of date ranges, determine how many days are
included in each span, and total them up for each client. The trick is,
there may be multiple records that span the same dates, and each day can
only be counted once. Moreover, there may be lapses in the ranges.
Example data for one client:
ID FROM_DTE TO_DTE ; comment
============================
1 1 OCT 7 OCT ; 7 days
2 8 OCT 14 OCT ; 7 days
3 1 OCT 14 OCT ; 14 days, but overlaps previous ranges
4 15 OCT 21 OCT ; 7 days
5 25 OCT 28 OCT ; 4 days, note lapse from previous period
The correct calculations on the data above would look like this:
ID SPAN
============
1 7
2 7
3 0 ; <-- do not count this span again
4 7
5 4
Result (sum): 25
For the end product I am not concerned about the line-by-line
calculations, only the final result.
I cannot simply subtract MIN(FROM_DTE) from MAX(TO_DTE) (which would
give a result of 28 days in the sample data) because of the potential
for lapses.
The only idea I have so far is to programmatically loop between each
FROM and TO date, stuff each date in to a temp table, and determine the
number of unique entries in the temp table. However I am not sure this
will be workable in real life as the production data has hundreds of
entries per client, tens of thousands of clients, and is recalculated on
a weekly cycle along with a ton of other metrics in the ETL.
Any ideas out there?
Thanks all,
included in each span, and total them up for each client. The trick is,
there may be multiple records that span the same dates, and each day can
only be counted once. Moreover, there may be lapses in the ranges.
Example data for one client:
ID FROM_DTE TO_DTE ; comment
============================
1 1 OCT 7 OCT ; 7 days
2 8 OCT 14 OCT ; 7 days
3 1 OCT 14 OCT ; 14 days, but overlaps previous ranges
4 15 OCT 21 OCT ; 7 days
5 25 OCT 28 OCT ; 4 days, note lapse from previous period
The correct calculations on the data above would look like this:
ID SPAN
============
1 7
2 7
3 0 ; <-- do not count this span again
4 7
5 4
Result (sum): 25
For the end product I am not concerned about the line-by-line
calculations, only the final result.
I cannot simply subtract MIN(FROM_DTE) from MAX(TO_DTE) (which would
give a result of 28 days in the sample data) because of the potential
for lapses.
The only idea I have so far is to programmatically loop between each
FROM and TO date, stuff each date in to a temp table, and determine the
number of unique entries in the temp table. However I am not sure this
will be workable in real life as the production data has hundreds of
entries per client, tens of thousands of clients, and is recalculated on
a weekly cycle along with a ton of other metrics in the ETL.
Any ideas out there?
Thanks all,