Help constructing a tricky query

B

Bill Jaeck

I am looking for help building an efficient query that can return the daily
costs for my employees. I have a table with my employee names and their IDs,
and another one with the daily hours worked by each employee, and finally
another table that contains the hourly rates for my employees. My hourly
rate table sometimes contains multiple rates for a single employee, and the
effective date of that rate. That's because I am nice to my employees and I
give them raises as time goes on.



I have the following tables:



Table: Employees - contains employee names and their EmployeeIDs.

EmployeeID
Name

1
Jack Jones

2
John Doe

Etc.
Etc.




Table: Hours - For each day that each Employee works, a row is written to
this table which specifies who worked when and for how many hours.

EmployeeID
Date
Hours

1
1/5/2003
2

1
2/10/2003
3

1
3/10/2003
4

2
1/6/2003
3

3
1/7/2003
3

Etc.






Table: Rates - Each Employee works for a different hourly rate, and the
rates change over time. This table contains the effective rates for each
employee and the date that the rate went into effect. The rate is in effect
for the employee until a new rate with a more recent EffectiveDate is
entered into the table.

EmployeeID
EffectiveDate
Rate

1
1/1/2003
$30.00

1
2/1/2003
$35.00

1
3/1/2003
$40.00

2
1/2/2003
$20.00

3
1/3/2003
$25.00

Etc.






Here is the problem. I want to create a "Costs" query which combines the
rows in the Hours table with the effective rate for the employee on the date
the hours were worked so that a net cost can be calculated for the work done
each day by the employee, e.g.

Query: Costs

EmployeeID
Date
Hours
Rate
Cost

1
1/5/2003
2
$30.00
$60.00

1
2/10/2003
3
$35.00
$105.00

1
3/10/2003
4
$40.00
$160.00

2
1/6/2003
3
$20.00
$60.00

3
1/7/2003
3
$25.00
$75.00

etc








My Hours table is BIG. I realize that I could write a VBA function to figure
out the rate (e.g. a function called RateForEmployee(argEmployeeID as long,
argDate as Date) as Currency). But that function has to build an SQL string
and then open a recordset for each row in my Hours table, and that takes a
long time.



Does anyone know how I could build this query without having to write a
function in VBA? I am looking for a high performance solution that somehow
joins the Hours table with the Rates table to obtain the effective rate for
the date that the hours were worked. The effective rate would be the rate
corresponding the the greatest effective date that is less than or equal to
the date that the hours were worked.
 
J

Jamie Collins

Bill said:
Table: Rates - Each Employee works for a different hourly rate, and the
rates change over time. This table contains the effective rates for each
employee and the date that the rate went into effect. The rate is in effect
for the employee until a new rate with a more recent EffectiveDate is
entered into the table.

You have a design flaw here (as I think I mentioned in your
near-identical thread, 'Help writing a high performance query'). Your
Rates table is a 'history' table (more formerly, a 'valid-time state
table'). You are modelling 'rate period start date' (as EffectiveDate)
but you are missing 'rate period end date'. Because it is missing from
your model, you must use determine it using a subquery e.g.

SELECT T1.EmployeeID, T1.rate_amount, T1.EffectiveDate AS
rate_start_date, (
SELECT MIN(T2.EffectiveDate)
FROM Rates AS T1
WHERE T1.EmployeeID = T2.EmployeeID
AND T1.EffectiveDate < T2.EffectiveDate
FROM Rates AS T1
);

Using start date and end date pairs think this denormalizing. Actually,
it's not: because the start date and end dates are part of a single
atomic fact, thus both attributes should be in the same row (google for
the work of Snodgrass on temporal data in SQL DBMS). No matter, it
still makes your queries easier to write and perform better.

You could argue that at present you have a simpler design but it is
flawed. Consider in your sample data the row EmployeeID=1 AND
EffectiveDate=2003-02-02 is deleted, say due to a bug or user error.
With your design the data erroneously 'heals' itself and you are left
with a spurious period between 2003-02-02 and 2003-03-03. Such a bug
could be hard to detect. If you had modelled using start- and end date
pairs you would be left with a 'hole' in the data, which is the desired
effect.

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top