Generation of Queries to identify duplicates records in Access Dat

Guest · Apr 21, 2006

We have an Access 2000 Database with Duplicate records. These records have
unique identifiers for the individuals (person or company.

The database is a registry, where individuals can register different
products. The duplicates have typically been caused by incorrect data entry
or re-entering related information for the same client for a new
product,without first searching for the existing client information (e.g.
Address).

We need to be able to identify these records, Link the information and then
delete the duplicate.

We have hypothesized on various ways the duplicate data was caused and have
come up with queries for identifying most of the records. We are however
battling with queries or scripts to identify (& extract or Group) duplicate
records the following nature

1. Duplicates caused, where the Name is exactly the same, but there
are slight differences in the address field

ID-PERSON NAME ADDRESS
1092260 MR GERNTZ, 45 ST. BARRY ROAD, DURBAN, 3423.
1094784 MR GERNTZ, 45 BARRY ROAD, DURBAN, 3423.

2. Duplicates caused, where the Address is exactly the same, but there
are slight differences in the Name field

ID-PERSON NAME ADDRESS
932 BITE-SOFT C.C. 8 HILL TERACE
MILNERTON 7441
933 BITESOFT C.C. 8 HILL TERACE
MILNERTON 7441

3. Duplicates caused, where the Name is exactly the same, but in one
record the address field is a Postal address, while in the other record the
address is a physical address.

ID-PERSON NAME ADDRESS
2345 MICHEAL ENGELS KIGELIA STR 53 FLORA
PARKBENONI0760
2087 MICHEAL ENGELS P O BOX 5852 BENONI 0760

4. Duplicates caused, where the Address is exactly the same, but in one
record within the Name field abbreviations are used and not used in the Name
field of the other record.

ID-PERSON NAME ADDRESS
3445 MADUMAS (PROPRIETARY) LIMITED P O BOX 6052 CAPE TOWN0360
3487 MADUMAS (PTY) LTD P O BOX 6052 CAPE
TOWN 0360

Guest · Apr 21, 2006

Hi Be Gentle - we had a very similar problem and what I did was this:

Created a Find Duplicates Query using (for example) Name and Address but
displaying all fields. Deleted all absolute duplicates.
Then created a Find Duplicates Query using only Name but displaying all
fields and deleted a bunch. Lastly created a Find Duplicates Query by Address
only but displaying all fields and deleted the last group.
Then on my entry screen added a subform linked back to the same DB that
displays if the person is already on the system. (Actually the screen has 5
subforms looking up the data in 5 different ways from 3 different DB's). By
doing this we have had only a very small number of duplicates in the last 12
months.

Hope this gives you some help.

Yours -- Dika

Guest · Apr 21, 2006

Thanks for your suggestion but the problem is that we cannot just delete the
data,we need to extract it so the client can go and verify that they are
really duplicates. The challenge is to write a script to extract these
records :
2345 MICHEAL ENGELS KIGELIA STR 53 FLORA PARK BENONI 0760
2087 MICHEAL ENGELS P O BOX 5852 BENONI 0760

I used the following query and it still does not get me the result I want.

SELECT *
FROM Table
WHERE exists (select null from Table p
where p.name = Table.name
and (( p.address) like 'P*') or (( Table.address) not like 'P*')
group by p.address,p.name
having count (p.idperson) >1);

the following query gets me these records:

JOHN TERRY PIDGEON P.O. Box 95011 Waterkloof 0145
JOHN TERRY PIDGEON P.O. Box 95011 Waterkloof 0145

SELECT *
FROM Table
WHERE Exists (select null from Table p
where p.[name] = Table.[name]
and p.address = Table.address
group by p.[name]
having count (p.idperson)>1);

Generation of Queries to identify duplicates records in Access Dat

Guest

Guest

Guest