Delete duplicates with conditions

D

Damian

Hi, I have a table with FieldA, FieldB, FieldC.

I want to delete records based on FieldA, leaving me with one unique record.
The corresponding data in FieldB and FieldC is sometimes NULL - sometimes not.

How do I DELETE duplicate records based on FIELDA where FieldB and FieldC
contain the most amount of data (or contain the least NULL values) i.e.

Field1, Field2, Field3
A , data , null DELETE
A , data , data KEEP
A , data , null DELETE
B , null , null DELETE
B , null , data KEEP
B , null , null DELETE
C , null , null DELETE
C , null , data DELETE
C , data , data KEEP

Cheers, Damian
 
A

Allen Browne

Firstly, your table will need a primary key, so you can distinguish between
duplicates. Otherwise the concept of which record to keep is undefined.

Next, the general approach is to use a subquery to identify the duplicates.
Here's an example:
http://allenbrowne.com/subquery-01.html#DeDuplicate

Now comes the question of how to identify the most desirable record, i.e.
the one that has the most data. The expression:
(Field2 Is Not Null)
returns True if the field has a value.
True is -1 in Access.
If you sum that value with the expression (Field3 Is Not Null), you will
get:
-2 if both have values
-1 if either one has a value
0 if neither has a value.
If you then sort by that expression, the -2 is sorted first, and so that's
the one you want to keep. You will therefore identify the desirable record
with:
ORDER BY ((Field2 Is Not Null) + (Field3 Is Not Null))
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top