Finding Duplicates

J

Jason

Hi all,
I'm running Access 2007 and would like to know if this logic is correct for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe Patients].pat_number =
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure this.

Thanks.
 
G

Guest

First, you should rethink your database design. There is no need to have two
tables to reflect your Patient Renewals. You should just add a RenewalYear
field to your table, and combine the data from the two tables into a single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale
 
J

Jason

Hi Dale,
my data originally comes from one table, I just extracted the data into two
tables because I thought it would be easier to compare. I have the patient
name and demographic in one table and the service dates in another table. I
queried those tables and extracted all 2006 data that I needed into one
table and did the same for 2007.
Does the query that I provided not give accurate results, in your opinion?
I tried your example and Access actually crashed on me.

Jason


Dale Fye said:
First, you should rethink your database design. There is no need to have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi all,
I'm running Access 2007 and would like to know if this logic is correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure this.

Thanks.
 
J

John Spencer

Actually the poster should probably add a new table to store the clientid
and the activeYear. One record per patient per year in this table. Rather
than having all the client data repeated in each record with the exception
of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

Dale Fye said:
First, you should rethink your database design. There is no need to have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi all,
I'm running Access 2007 and would like to know if this logic is correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure this.

Thanks.
 
G

Guest

You're probably right on that. I didn't spend a lot of time looking at the
various fields in his two tables, just that he had two when one would have
sufficed.
--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


John Spencer said:
Actually the poster should probably add a new table to store the clientid
and the activeYear. One record per patient per year in this table. Rather
than having all the client data repeated in each record with the exception
of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

Dale Fye said:
First, you should rethink your database design. There is no need to have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi all,
I'm running Access 2007 and would like to know if this logic is correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure this.

Thanks.
 
G

Guest

Can you post the sql of the query you wrote in response to my reply. I'll
take a look and see if I can figure out why it crashed Access.

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi Dale,
my data originally comes from one table, I just extracted the data into two
tables because I thought it would be easier to compare. I have the patient
name and demographic in one table and the service dates in another table. I
queried those tables and extracted all 2006 data that I needed into one
table and did the same for 2007.
Does the query that I provided not give accurate results, in your opinion?
I tried your example and Access actually crashed on me.

Jason


Dale Fye said:
First, you should rethink your database design. There is no need to have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi all,
I'm running Access 2007 and would like to know if this logic is correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure this.

Thanks.
 
J

Jason

Hi Jon,
Do you mean add a table and the clientID and activeYear for both 2006 and
2007 both in the same table? If so, what would be the SQL I would need to
run? I still am not understanding how to accomplish this.

Thanks.

John Spencer said:
Actually the poster should probably add a new table to store the clientid
and the activeYear. One record per patient per year in this table.
Rather than having all the client data repeated in each record with the
exception of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
.

Dale Fye said:
First, you should rethink your database design. There is no need to have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10 years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal, otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


Jason said:
Hi all,
I'm running Access 2007 and would like to know if this logic is correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe
Patients].pat_number =
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure
this.

Thanks.
 
J

Jason

Dale,
here's the SQL that I wrote:

SELECT ar1ord.Pat_Number, Count(ar1ord.Pat_Number) as RenewalCount
FROM ar1ord
WHERE ar1ord.svc_date >= #12/31/2006# and ar1ord.svc_date <= #12/31/2007#;

The error says:
"You tried to execute a query that does not include the specified expression
'pat_number' as part of an aggregate function."
For that particular table, I know pat_number does exist, so I'm not sure
what else that error means.
Dale Fye said:
You're probably right on that. I didn't spend a lot of time looking at
the
various fields in his two tables, just that he had two when one would have
sufficed.
--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


John Spencer said:
Actually the poster should probably add a new table to store the clientid
and the activeYear. One record per patient per year in this table.
Rather
than having all the client data repeated in each record with the
exception
of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

Dale Fye said:
First, you should rethink your database design. There is no need to
have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10
years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal,
otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


:

Hi all,
I'm running Access 2007 and would like to know if this logic is
correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in
another
table). The logic would be to see which clients from last year are
still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe
Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure
this.

Thanks.
 
J

John Spencer

First of all, in response to your first question. The query you have
should work.

Now about table structure. What I was advising is that you have a
clients table (which you have) but not store the active year in the
clients table.

In the AnnualClients table you would have
ClientID (the value from the clients table)
YearNo (the number of the year)
(any other fields that relate to the year and client

Assuming that you restrict the table so you can only have one record per
client per year (a unique combined index on the ClientID plus YearNo)

SELECT Clients.*
FROM Clients
WHERE ClientID in
(SELECT A.ClientID
FROM AnnualClients as A
WHERE YearNo In (2005,2006)
GROUP BY A.ClientID
HAVING Count(YearNo) =2)

If you wanted the clients for 2005
ELECT Clients.*
FROM Clients INNER JOIN AnnualClients
ON Clients.ClientID = AnnualClients.ClientID
WHERE Clients.ClientID in
(SELECT A.ClientID
FROM AnnualClients as A
WHERE YearNo In (2005,2006)
GROUP BY A.ClientID
HAVING Count(YearNo) =1)
AND AnnualClients.YearNo = 2005



'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Hi Jon,
Do you mean add a table and the clientID and activeYear for both 2006
and 2007 both in the same table? If so, what would be the SQL I would
need to run? I still am not understanding how to accomplish this.

Thanks.

John Spencer said:
Actually the poster should probably add a new table to store the
clientid and the activeYear. One record per patient per year in this
table. Rather than having all the client data repeated in each record
with the exception of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
.

Dale Fye said:
First, you should rethink your database design. There is no need to
have two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10
years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal,
otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


:

Hi all,
I'm running Access 2007 and would like to know if this logic is
correct for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in another
table). The logic would be to see which clients from last year are
still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe
Patients].pat_number =
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure
this.

Thanks.
 
D

Dale Fye

It means that I forgot the GroupBy clause.

Add:

Group By ar1ord.Pat_Number to the end of the SQL

But your Where clause will only include records where the svc_Date is in
2007, so I don't think you will ever get a RenewalCount of greater than 1.
I think you need to extend the range from >12/31/05 to < 1/01/08.

HTH
Dale

Jason said:
Dale,
here's the SQL that I wrote:

SELECT ar1ord.Pat_Number, Count(ar1ord.Pat_Number) as RenewalCount
FROM ar1ord
WHERE ar1ord.svc_date >= #12/31/2006# and ar1ord.svc_date <= #12/31/2007#;

The error says:
"You tried to execute a query that does not include the specified
expression 'pat_number' as part of an aggregate function."
For that particular table, I know pat_number does exist, so I'm not sure
what else that error means.
Dale Fye said:
You're probably right on that. I didn't spend a lot of time looking at
the
various fields in his two tables, just that he had two when one would
have
sufficed.
--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


John Spencer said:
Actually the poster should probably add a new table to store the
clientid
and the activeYear. One record per patient per year in this table.
Rather
than having all the client data repeated in each record with the
exception
of a differenct active (renewal) year.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

First, you should rethink your database design. There is no need to
have
two
tables to reflect your Patient Renewals. You should just add a
RenewalYear
field to your table, and combine the data from the two tables into a
single
table. The advantage to this is that you can then have 3, 4, or 10
years
worth of data in a single table.

You could then do a query that looked something like:

SELECT Pat_Number, Count(Pat_Number) as RenewalCount
WHERE RenewalYear >= 2006 and RenewalYear <= 2007

If the RenewalCount field = 2 then you know you have a renewal,
otherwise
you either lost the customer, or gained a new one.

Dale

--
Don''t forget to rate the post if it was helpful!

Email address is not valid.
Please reply to newsgroup only.


:

Hi all,
I'm running Access 2007 and would like to know if this logic is
correct
for
searching duplicates in two tables.
I'm trying to find my renewal rate between all clients from last year
(stored in one table) and all clients from this year (stored in
another
table). The logic would be to see which clients from last year are
still
clients this year.
Here's the SQL statement:
--
SELECT [Patient Renewals - 2007 Shoe Patients].pat_number, [Patient
Renewals - 2007 Shoe Patients].f_name, [Patient Renewals - 2007 Shoe
Patients].l_name, [Patient Renewals - 2007 Shoe Patients].item_code,
[Patient Renewals - 2007 Shoe Patients].svc_date
FROM [Patient Renewals - 2006 Shoe Patients] INNER JOIN [Patient
Renewals -
2007 Shoe Patients] ON [Patient Renewals - 2006 Shoe
Patients].pat_number
=
[Patient Renewals - 2007 Shoe Patients].pat_number;
--

I get a result that seems like the script works, but just wanted to
double-check here and see if there's a more accurate way to measure
this.

Thanks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top