Identifying Duplicates in Access ?

D

Davie

Lets assume I have a table with name, address, postcode in it.


id | name | address1 | address2 | postcode

2 jimbo 3 highstreet london lh34 4tn
45 jimbo 3 high street london lh43 4tn


Assume these are two duplicate users, albeit their address is different,
what is the easiest way of determining they are duplicates?
 
G

Guest

Hi Davie,

The easiest way is to use the Find Duplicates Query Wizard under New Query
in the Database window.
Regards.
 
K

Kjell Harnesk

Lets assume I have a table with name, address, postcode in it.

id | name | address1 | address2 | postcode

2 jimbo 3 highstreet london lh34 4tn
45 jimbo 3 high street london lh43 4tn

Assume these are two duplicate users, albeit their address is different,
what is the easiest way of determining they are duplicates?

Two people with same name but with different addresses can't be duplicates.
I think it´s illegal to clone people. In the example above the addresses
are the same and one of the records can be deleted because even if they in
fact represent two diffrent people the are the same datawise if there are
no other fields telling them apart.
 
J

John Marshall, MVP

Finding duplicates in name/address is not an easy task. There are a number
of issues that make an address appear different to a computer, but the
average user can spot them as being the same. You should pass the
information through a filter to remove some of the problems.
1) Mixed case -> change a copy of the data to all upper case or all lower
case.
2) Remove white space -> the fields may contain blanks or tabs within the
text or before and after.
3) Standardize on common words -> street to st, road to rd, private to pvt
etc

You can then use this temporary field (or fields) to find duplicates. You
may want to keep the fields seperate. In your example, three of the four
fields match after a cleanup. Only the postcode differs. The same would be
true of one of the address was missing the postcode. A three out of four
match may be close enough to be considered as a possible match.

Even if you do find duplicates it is possible that they are different people
living at the same address. (father /son)

John... Visio MVP
 
K

Kjell Harnesk

Lets assume I have a table with name, address, postcode in it.

id | name | address1 | address2 | postcode

2 jimbo 3 highstreet london lh34 4tn
45 jimbo 3 high street london lh43 4tn

Assume these are two duplicate users, albeit their address is different,
what is the easiest way of determining they are duplicates?

Sorry, I din not notice the small difference in the postcode.
But as John stated, it´s a difficult task and You can never be quite sure
without checking with the person(s) themselves.
 
G

Guest

What if I don't have access to the Duplicates Wizard and want to first: Get a
numerical value for the number of duplicate addresses in a table then second:
eliminate the duplicates so only a singel record for that address remains?
 
J

John Vinson

On Tue, 14 Feb 2006 13:31:26 -0800, "Campaign Staffer" <Campaign
What if I don't have access to the Duplicates Wizard and want to first: Get a
numerical value for the number of duplicate addresses in a table then second:
eliminate the duplicates so only a singel record for that address remains?

Create a Totals Query. Create a new query based on your table; select
the Primary Key field and whatever other fields jointly identify a
duplicate (e.g. you don't want to just use LastName to identify
duplicates because two people might have the same last name - or even
firstname, lastname and phonenumber, as in Joe Doakes and Joe Doakes
Jr.)

Group By the fields which constitute the dups, and select Count on the
totals row under the Primary Key. Put a criterion of >1 on this field
if you don't want to see singletons, only duplicates.

If there are additional fields which can legitimately be different
while still considering the record to be a duplicate, you'll need to
decide WHICH near-duplicate you want to keep. It may be simplest to
just use a continuous Form sorted by the duplicate-defining fields and
manually delete; or you can use a Delete Query using the duplicates
query as a subquery. This can be a bit tricky to set up and have it be
updateable; more details of your table structure and criteria for
deletion would be helpful!

John W. Vinson[MVP]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top