Delete Duplicates with a Delete Query

N

NEWER USER

I have a table with 3.2 million records. In this table are 1000 duplicates
that I found with the Find Duplicates Query in Access 2003. I want to remove
500 and leave the other 500 in tact. So, I created a Totals Query and grouped
on six fields and added another field which is an ID field(autonumber/primary
key) to get a unique field in each record and set it to First. The query
returned expected results. I saved this query as "Query16".

I then created a Delete Query and pulled down the "*" fields to the grid
from the table "Catalog" containing the records I want to delete. I also
added all seven fields from the Catalog table to the grid and set the
Criteria to the Totals query. I linked the tables on Field1.

When I View the results, i get the 500 records that I want to delete.
However, when I try to run the actual Query, I get the Error Message 3086.
Could not delete from specified table. It is not opened as read-only.

DELETE Catalog.*, Catalog.Field1, Catalog.Field7, Catalog.Field8,
Catalog.Field9, Catalog.Field10, Catalog.Field13, Catalog.ID
FROM [Catalog] INNER JOIN Query16 ON Catalog.Field1 = Query16.Field1
WHERE (((Catalog.Field1)=[Query16].[Field1]) AND
((Catalog.Field7)=[Query16].[Field7]) AND
((Catalog.Field8)=[Query16].[Field8]) AND
((Catalog.Field9)=[Query16].[Field9]) AND
((Catalog.Field10)=[Query16].[Field10]) AND
((Catalog.Field13)=[Query16].[Field13]) AND
((Catalog.ID)<>[Query16].[FirstOfID]));

Can anyone tell me why this Query will not run? Any help appreciated.
 
J

John Spencer

Try the following. WARNING Backup your data BEFORE you use this. If it
does not work the way you expect, you cannot recover

DELETE
FROM Catalog
WHERE Catalog.ID in
(SELECT LAST(Catalog.ID)
FROM Catalog
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

If you have cases where there are three (or more) records duplicated,
you will have to rerun the query to continue deleting the duplicates.

You can try the following if you have three or more duplicates based on
the field combination. The problem is that "NOT in" with a subquery
tends to be very slow for large sets of records.

DELETE
FROM Catalog
WHERE Catalog.ID NOT in
(SELECT FIRST(Catalog.ID)
FROM Catalog
WHERE ID is NOT NULL
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================
 
N

NEWER USER

I get a Syntax error. All is highlighted from Select to ); the left
parenthesis and all before is not highlighted

John Spencer said:
Try the following. WARNING Backup your data BEFORE you use this. If it
does not work the way you expect, you cannot recover

DELETE
FROM Catalog
WHERE Catalog.ID in
(SELECT LAST(Catalog.ID)
FROM Catalog
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

If you have cases where there are three (or more) records duplicated,
you will have to rerun the query to continue deleting the duplicates.

You can try the following if you have three or more duplicates based on
the field combination. The problem is that "NOT in" with a subquery
tends to be very slow for large sets of records.

DELETE
FROM Catalog
WHERE Catalog.ID NOT in
(SELECT FIRST(Catalog.ID)
FROM Catalog
WHERE ID is NOT NULL
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================


NEWER said:
I have a table with 3.2 million records. In this table are 1000 duplicates
that I found with the Find Duplicates Query in Access 2003. I want to remove
500 and leave the other 500 in tact. So, I created a Totals Query and grouped
on six fields and added another field which is an ID field(autonumber/primary
key) to get a unique field in each record and set it to First. The query
returned expected results. I saved this query as "Query16".

I then created a Delete Query and pulled down the "*" fields to the grid
from the table "Catalog" containing the records I want to delete. I also
added all seven fields from the Catalog table to the grid and set the
Criteria to the Totals query. I linked the tables on Field1.

When I View the results, i get the 500 records that I want to delete.
However, when I try to run the actual Query, I get the Error Message 3086.
Could not delete from specified table. It is not opened as read-only.

DELETE Catalog.*, Catalog.Field1, Catalog.Field7, Catalog.Field8,
Catalog.Field9, Catalog.Field10, Catalog.Field13, Catalog.ID
FROM [Catalog] INNER JOIN Query16 ON Catalog.Field1 = Query16.Field1
WHERE (((Catalog.Field1)=[Query16].[Field1]) AND
((Catalog.Field7)=[Query16].[Field7]) AND
((Catalog.Field8)=[Query16].[Field8]) AND
((Catalog.Field9)=[Query16].[Field9]) AND
((Catalog.Field10)=[Query16].[Field10]) AND
((Catalog.Field13)=[Query16].[Field13]) AND
((Catalog.ID)<>[Query16].[FirstOfID]));

Can anyone tell me why this Query will not run? Any help appreciated.
 
J

John Spencer

Which version did you use?

I do not see any syntax error in what I posted.

Did you use what I posted as the entire query?

All I can say is post the SQL statement that is failing. Perhaps
someone can spot what is causing the error.

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================


NEWER said:
I get a Syntax error. All is highlighted from Select to ); the left
parenthesis and all before is not highlighted

John Spencer said:
Try the following. WARNING Backup your data BEFORE you use this. If it
does not work the way you expect, you cannot recover

DELETE
FROM Catalog
WHERE Catalog.ID in
(SELECT LAST(Catalog.ID)
FROM Catalog
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

If you have cases where there are three (or more) records duplicated,
you will have to rerun the query to continue deleting the duplicates.

You can try the following if you have three or more duplicates based on
the field combination. The problem is that "NOT in" with a subquery
tends to be very slow for large sets of records.

DELETE
FROM Catalog
WHERE Catalog.ID NOT in
(SELECT FIRST(Catalog.ID)
FROM Catalog
WHERE ID is NOT NULL
GROUP BY Field1, Field7, Field8, Field9, Field10, Field 13
HAVING Count(*)>1)

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================


NEWER said:
I have a table with 3.2 million records. In this table are 1000 duplicates
that I found with the Find Duplicates Query in Access 2003. I want to remove
500 and leave the other 500 in tact. So, I created a Totals Query and grouped
on six fields and added another field which is an ID field(autonumber/primary
key) to get a unique field in each record and set it to First. The query
returned expected results. I saved this query as "Query16".

I then created a Delete Query and pulled down the "*" fields to the grid
from the table "Catalog" containing the records I want to delete. I also
added all seven fields from the Catalog table to the grid and set the
Criteria to the Totals query. I linked the tables on Field1.

When I View the results, i get the 500 records that I want to delete.
However, when I try to run the actual Query, I get the Error Message 3086.
Could not delete from specified table. It is not opened as read-only.

DELETE Catalog.*, Catalog.Field1, Catalog.Field7, Catalog.Field8,
Catalog.Field9, Catalog.Field10, Catalog.Field13, Catalog.ID
FROM [Catalog] INNER JOIN Query16 ON Catalog.Field1 = Query16.Field1
WHERE (((Catalog.Field1)=[Query16].[Field1]) AND
((Catalog.Field7)=[Query16].[Field7]) AND
((Catalog.Field8)=[Query16].[Field8]) AND
((Catalog.Field9)=[Query16].[Field9]) AND
((Catalog.Field10)=[Query16].[Field10]) AND
((Catalog.Field13)=[Query16].[Field13]) AND
((Catalog.ID)<>[Query16].[FirstOfID]));

Can anyone tell me why this Query will not run? Any help appreciated.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top