Average Sales Query.

  • Thread starter Thread starter Chris M
  • Start date Start date
C

Chris M

Hi Group.

I'm sure this should be straight forward, but I just can't get my head round
the most efficient way to do it...

I have 2 tables.

PARTS(14,000 Records)
PART_NO DESCRIPTION

SALES(120,000 Records)
PART_NO SALES_DATE SALE_PRICE

What I want is a query to show the following:

PART_NO DESCRIPTION AVG_SALE

Where AVG_SALE is the average of the last 3 SALES (ie from the SALES table
for that PART_NO for the highest 3 SALES_DATEs)

Any suggestions gratefully received.

Chris.
 
Hi Group.
I'm sure this should be straight forward, but I just can't get my head
round the most efficient way to do it...

I have 2 tables.

PARTS(14,000 Records)
PART_NO DESCRIPTION

SALES(120,000 Records)
PART_NO SALES_DATE SALE_PRICE

What I want is a query to show the following:

PART_NO DESCRIPTION AVG_SALE

Where AVG_SALE is the average of the last 3 SALES (ie from the SALES table
for that PART_NO for the highest 3 SALES_DATEs)

Any suggestions gratefully received.

Chris.



*** Forgot to say, SALES table also has a KEY field which is a unique ID for
each record.
 
Chris M said:
*** Forgot to say, SALES table also has a KEY field which is a unique ID
for each record.

This is my best effort so far, but it is toooo sllloooowwww...

SELECT PARTS.PART_NO, PARTS.PART_DESC, ROUND(Avg(SA.SALE_PRICE),2) AS
SP,count(SA.SALE_PRICE) as SALES
FROM PARTS, SALES AS SA
WHERE (((SA.PART_NO) Like "CC*") AND ((PARTS.PART_NO)=[SA].[PART_NO]) AND
((SA.Key) In (SELECT TOP 3 SB.KEY FROM SALES as SB
WHERE SB.PART_NO = SA.PART_NO
ORDER BY SB.SALE_DATE DESC
)))
GROUP BY PARTS.PART_NO, PARTS.PART_DESC;

Chris.
 
Do you want the last three sales or do you want the sales on the last three
dates? Those can be different things depending on your data.

Whichever, I would try the following to speed up the query.

First, use an INNER JOIN instead of a Cartesian Join with a where clause.
Second, check that there are indexes on Sale_Date and Both Part_No fields
(you automatically do if they are part of a defined relationship)

SELECT PARTS.PART_NO, PARTS.PART_DESC,
ROUND(Avg(SA.SALE_PRICE),2) AS SP,
Count(SA.SALE_PRICE) as SALES
FROM PARTS INNER JOIN SALES AS SA
ON Parts.Part_No = SA.Part_No

WHERE SA.PART_NO Like "CC*" AND
SA.Key In (SELECT TOP 3 SB.KEY FROM SALES as SB
WHERE SB.PART_NO = SA.PART_NO
ORDER BY SB.SALE_DATE DESC)

GROUP BY PARTS.PART_NO, PARTS.PART_DESC;

Also, you do realize that your subquery can return more than 3 records if
there are ties on the date. If you want to strictly limit this to three
records you may need to add SB.Key to your ORDER BY Clause in the subquery.
Chris M said:
Chris M said:
*** Forgot to say, SALES table also has a KEY field which is a unique ID
for each record.

This is my best effort so far, but it is toooo sllloooowwww...

SELECT PARTS.PART_NO, PARTS.PART_DESC, ROUND(Avg(SA.SALE_PRICE),2) AS
SP,count(SA.SALE_PRICE) as SALES
FROM PARTS, SALES AS SA
WHERE (((SA.PART_NO) Like "CC*") AND ((PARTS.PART_NO)=[SA].[PART_NO]) AND
((SA.Key) In (SELECT TOP 3 SB.KEY FROM SALES as SB
WHERE SB.PART_NO = SA.PART_NO
ORDER BY SB.SALE_DATE DESC
)))
GROUP BY PARTS.PART_NO, PARTS.PART_DESC;

Chris.
 
Chris M said:
Chris M said:
Hi Group.

I'm sure this should be straight forward, but I just can't get my head
round the most efficient way to do it...

I have 2 tables.

PARTS(14,000 Records)
PART_NO DESCRIPTION

SALES(120,000 Records)
PART_NO SALES_DATE SALE_PRICE

What I want is a query to show the following:

PART_NO DESCRIPTION AVG_SALE

Where AVG_SALE is the average of the last 3 SALES (ie from the SALES
table for that PART_NO for the highest 3 SALES_DATEs)

Any suggestions gratefully received.

Chris.



*** Forgot to say, SALES table also has a KEY field which is a unique ID
for each record.

This is my best effort so far, but it is toooo sllloooowwww...

SELECT PARTS.PART_NO, PARTS.PART_DESC, ROUND(Avg(SA.SALE_PRICE),2) AS
SP,count(SA.SALE_PRICE) as SALES
FROM PARTS, SALES AS SA
WHERE (((SA.PART_NO) Like "CC*") AND ((PARTS.PART_NO)=[SA].[PART_NO]) AND
((SA.Key) In (SELECT TOP 3 SB.KEY FROM SALES as SB
WHERE SB.PART_NO = SA.PART_NO
ORDER BY SB.SALE_DATE DESC
)))
GROUP BY PARTS.PART_NO, PARTS.PART_DESC;

Chris.

John Spencer said:
Do you want the last three sales or do you want the sales on the last
three dates? Those can be different things depending on your data.

Whichever, I would try the following to speed up the query.

First, use an INNER JOIN instead of a Cartesian Join with a where clause.
Second, check that there are indexes on Sale_Date and Both Part_No fields
(you automatically do if they are part of a defined relationship)

SELECT PARTS.PART_NO, PARTS.PART_DESC,
ROUND(Avg(SA.SALE_PRICE),2) AS SP,
Count(SA.SALE_PRICE) as SALES
FROM PARTS INNER JOIN SALES AS SA
ON Parts.Part_No = SA.Part_No

WHERE SA.PART_NO Like "CC*" AND
SA.Key In (SELECT TOP 3 SB.KEY FROM SALES as SB
WHERE SB.PART_NO = SA.PART_NO
ORDER BY SB.SALE_DATE DESC)

GROUP BY PARTS.PART_NO, PARTS.PART_DESC;

Also, you do realize that your subquery can return more than 3 records if
there are ties on the date. If you want to strictly limit this to three
records you may need to add SB.Key to your ORDER BY Clause in the
subquery.


John, thanks for your answer.
I thought TOP only returned the TOP n records. Is there an explanation
somewhere of exactly how it works when there are ties? In this case, it's
not important, as if the average is over 4 records and not 3, I really don't
mind.

Btw, your query runs in 1:57 against 2:01 for my original. (for ALL parts,
ie ' WHERE SA.PART_NO Like "*" ')

A virtual pint(or your drink of choice) for anyone that can substantially
reduce this time. (by improving the query, not by suggesting I upgrade my
memory/processor etc.)

Cheers,

Chris.
 
I thought TOP only returned the TOP n records. Is there an explanation
somewhere of exactly how it works when there are ties? In this case, it's

Search online help for
ALL, DISTINCT, DISTINCTROW, TOP Predicates

Here is an extract from the online help

TOP n [PERCENT] Returns a certain number of records that fall at the top or
the bottom of a range specified by an ORDER BY clause. Suppose you want the
names of the top 25 students from the class of 1994:
SELECT TOP 25
FirstName, LastName
FROM Students
WHERE GraduationYear = 1994
ORDER BY GradePointAverage DESC;

If you don't include the ORDER BY clause, the query will return an arbitrary
set of 25 records from the Students table that satisfy the WHERE clause.

===========================================
The TOP predicate doesn't choose between equal values. In the preceding
example, if the twenty-fifth and twenty-sixth highest grade point averages
are the same, the query will return 26 records.
===========================================
The value that follows TOP must be an unsigned Integer.TOP doesn't affect
whether or not the query is updatable.

By the way, did you check the indexes?

Other than that, I'm not sure there is much you can do to speed up the
query.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top