deleting duplicate records obtained from a duplicate query

  • Thread starter Thread starter Paul
  • Start date Start date
P

Paul

I used the "Duplicate Query" wizard in Access to display duplicate records
in a table. Is there a way to use that to create a Delete query that would
remove one of the records from each of the two duplicates? There are about
300 duplicates in a recordset of about 8,000 records, so it would be time
consuming to find each each one and delete them one at a time.

Thanks in advance,

Paul
 
You would probably want to use an update query to do that. A delete query is
normally used for deleting an entire table. If you have an id you can link
on, do that and then update the fields to null where there are dups.
 
Thanks for the suggestion, golfinray.

I do sometimes use delete queries to delete subsets of records specified by
criteria in the delete query.

Since I have duplicate records, what I'm looking for is a way to remove the
entire duplicate record from the table, but only one of the duplicate pair,
not both of them. It isn't clear to me how I accomplish that by updating
the fields to Null as you described.

???
 
One way to handle this would be to build a new table with a compound unique
index based on the fields that define duplication. Then import all the
records into this new table and ignore the errors. Once the import is
successful, delete the old table and rename the new table to the old table's name.

Another way, (BACK UP your data before you do this)
Build a query based on the table that will identify the primary key values you
want to keep and save that as qKeepThese . If you don't care which of the
"duplicate" records you want to keep then you can use the First aggregate
function to more-or-less randomly select one.

Query One: << This query is the key to identifying which records to keep>>
If you need to use multiple fields to identify the duplicates then add those
fields to the Group By clause.
SELECT First(PrimaryKeyField) as FirstID
FROM TheTable
GROUP BY TelephoneNumber


Query Two:
DELETE DistinctRow T.*
FROM TheTable as T
WHERE T.PrimaryKeyField IN
(SELECT PrimaryKeyField
FROM TheTable LEFT JOIN QKeepThese
ON TheTable.PrimaryKeyField= QKeepThese.FirstID
WHERE qKeepThese.FirstID is Null)

All in one query would be as follows - only works if field and table names
don't contain "Special" characters.

DELETE DistinctRow T.*
FROM TheTable as T
WHERE T.PrimaryKeyField IN
(SELECT PrimaryKeyField
FROM TheTable LEFT JOIN
(SELECT First(PrimaryKeyField) as FirstID
FROM TheTable
GROUP BY TelephoneNumber) AS QKeepThese
ON TheTable.PrimaryKeyField= QKeepThese.FirstID
WHERE qKeepThese.FirstID is Null)

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Thanks so much for the three solutions, John.

I used the first one which did the job and it saved me a lot of time. I'm
also going to experiment with the other two on a back up copy of the file
for my general knowledge.

Best regards,

Paul
 
Back
Top