Removing duplicates from a query

S

Support

I have a query that looks at linked tables and outputs customers who have
bought certain brands. The query works but I am getting duplicate results
where the customer has bought more than one product of the same brand.
What would be the best method of removing the duplicates from the result so
that it just shows one instance of the customer? For this query, I am not
interested in the actual product bought, just the customer ID. I have a
field withn the reults table showing the customer_id.
I've run the simple find duplicates query where it is searching for a
duplicate customer_id but this isn't really what I want as it shows which
ones are dupes. What I'd like is for the output to show me my original
query showing just one instance of the customer_id. If further info is
needed, please feel free to ask!
thanks
 
G

Guest

If you want the customer to show up once for each brand, you have two options

1. In the Query Properties, for Unique Values, select Yes
2. Click the Totals button to show the Total row. The Total row will say "Group By" for each field of the query

If you want the customer to show up once and only once, even if they've bought more than one of the brands you are selecting for, click the Totals button to show the Total row. In the field for the brand (and any fields referring to the table with the brand) in the Total row, select "First"

----- Support wrote: ----

I have a query that looks at linked tables and outputs customers who hav
bought certain brands. The query works but I am getting duplicate result
where the customer has bought more than one product of the same brand
What would be the best method of removing the duplicates from the result s
that it just shows one instance of the customer? For this query, I am no
interested in the actual product bought, just the customer ID. I have
field withn the reults table showing the customer_id
I've run the simple find duplicates query where it is searching for
duplicate customer_id but this isn't really what I want as it shows whic
ones are dupes. What I'd like is for the output to show me my origina
query showing just one instance of the customer_id. If further info i
needed, please feel free to ask
thank
 
S

Support

Hi Corinna and thanks for your help.
I ran the last suggestion:
"If you want the customer to show up once and only once, even if they've
bought more than one of the brands you are selecting for, click the Totals
button to show the Total row. In the field for the brand (and any fields
referring to the table with the brand) in the Total row, select "First"."
However I ran into some problems. I ran the query with this in and
retrieved 105 results. I then ran the same query without the totals in and
retrieved 215 results.
I decided to check this and using the duplicates wizard, I counted the
number of duplicates which only came to 38 so somewhere it is not seeing 60
odd records. Do you know why this might be? Here are the 2 queries:

With totals on (ATB):

SELECT tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName,
tblFinal.c_Email, First(tblAllorderlines.l_OrderLineProductDescription) AS
FirstOfl_OrderLineProductDescription, tblFinal.c_SourceCustomerId
FROM (tblFinal INNER JOIN tblOrders ON tblFinal.c_SourceCustomerId =
tblOrders.o_SourceCustomerID) INNER JOIN tblAllorderlines ON
tblOrders.o_OrderNumber = tblAllorderlines.l_OrderNumber
GROUP BY tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName,
tblFinal.c_Email, tblFinal.c_SourceCustomerId
HAVING (((First(tblAllorderlines.l_OrderLineProductDescription)) Like
"*scrub*" Or (First(tblAllorderlines.l_OrderLineProductDescription)) Like
"*grit*"));

Without totals(ATB No totals):

SELECT tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName,
tblFinal.c_Email, tblAllorderlines.l_OrderLineProductDescription,
tblFinal.c_SourceCustomerId
FROM (tblFinal INNER JOIN tblOrders ON
tblFinal.c_SourceCustomerId=tblOrders.o_SourceCustomerID) INNER JOIN
tblAllorderlines ON tblOrders.o_OrderNumber=tblAllorderlines.l_OrderNumber
WHERE (((tblAllorderlines.l_OrderLineProductDescription) Like "*scrub*" Or
(tblAllorderlines.l_OrderLineProductDescription) Like "*grit*"));

And the duplicates query (which is ran on the no totals query directly
above):

SELECT [ATB No totals].c_SourceCustomerId, [ATB No totals].c_FirstName, [ATB
No totals].c_LastName
FROM [ATB No totals]
WHERE ((([ATB No totals].c_SourceCustomerId) In (SELECT [c_SourceCustomerId]
FROM [ATB No totals] As Tmp GROUP BY [c_SourceCustomerId] HAVING
Count(*)>1 )))
ORDER BY [ATB No totals].c_SourceCustomerId;

thanks
 
G

Guest

If I understand you correctly, your results are actually pretty logical. The query tool generates an SQL statement (the actual query language) from what it sees in the query tool. When you take out the Total row, you eliminate the "Group By" clause and the "First" aggregate function on the l_OrderLineProductDescription field from the resulting SQL statement. (FYI - in SQL, you aren't allowed to have an aggregate function like "First" without the "Group By" clause on all non-aggregated fields.

This means two things are actually happening, only one of which your duplicates query is going to show. Removing the "First" function allows a customer that bought both "*scrub*" and "*grit*" to show up twice, once for each brand. If you remove the "First" function in the Totals row (just change it to "Group By") and run the duplicates query, you shouldn't get any duplicates. Removing the "Group By" clause (by removing/hiding the Totals row) allows a customer that bought a brand multiple times to show up multiple times in the query, which the duplicates query will show

So you're 60-odd missing records are likely customers that bought both "*scrub*" and "*grit*"

Corrina Burnle
Burnley Data Solution
www.burnleydatasolutions.co

----- Support wrote: ----

Hi Corinna and thanks for your help
I ran the last suggestion
"If you want the customer to show up once and only once, even if they'v
bought more than one of the brands you are selecting for, click the Total
button to show the Total row. In the field for the brand (and any field
referring to the table with the brand) in the Total row, select "First".
However I ran into some problems. I ran the query with this in an
retrieved 105 results. I then ran the same query without the totals in an
retrieved 215 results
I decided to check this and using the duplicates wizard, I counted th
number of duplicates which only came to 38 so somewhere it is not seeing 6
odd records. Do you know why this might be? Here are the 2 queries

With totals on (ATB)

SELECT tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName
tblFinal.c_Email, First(tblAllorderlines.l_OrderLineProductDescription) A
FirstOfl_OrderLineProductDescription, tblFinal.c_SourceCustomerI
FROM (tblFinal INNER JOIN tblOrders ON tblFinal.c_SourceCustomerId
tblOrders.o_SourceCustomerID) INNER JOIN tblAllorderlines O
tblOrders.o_OrderNumber = tblAllorderlines.l_OrderNumbe
GROUP BY tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName
tblFinal.c_Email, tblFinal.c_SourceCustomerI
HAVING (((First(tblAllorderlines.l_OrderLineProductDescription)) Lik
"*scrub*" Or (First(tblAllorderlines.l_OrderLineProductDescription)) Lik
"*grit*"))

Without totals(ATB No totals)

SELECT tblFinal.c_Title, tblFinal.c_FirstName, tblFinal.c_LastName
tblFinal.c_Email, tblAllorderlines.l_OrderLineProductDescription
tblFinal.c_SourceCustomerI
FROM (tblFinal INNER JOIN tblOrders O
tblFinal.c_SourceCustomerId=tblOrders.o_SourceCustomerID) INNER JOI
tblAllorderlines ON tblOrders.o_OrderNumber=tblAllorderlines.l_OrderNumbe
WHERE (((tblAllorderlines.l_OrderLineProductDescription) Like "*scrub*" O
(tblAllorderlines.l_OrderLineProductDescription) Like "*grit*"))

And the duplicates query (which is ran on the no totals query directl
above)

SELECT [ATB No totals].c_SourceCustomerId, [ATB No totals].c_FirstName, [AT
No totals].c_LastNam
FROM [ATB No totals
WHERE ((([ATB No totals].c_SourceCustomerId) In (SELECT [c_SourceCustomerId
FROM [ATB No totals] As Tmp GROUP BY [c_SourceCustomerId] HAVIN
Count(*)>1 ))
ORDER BY [ATB No totals].c_SourceCustomerId

thank
 
S

Support

Thanks again Corrina,
that's really helped me understand what is actually going on within the
queries.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top