Compare and delete duplicate records

  • Thread starter Thread starter charles
  • Start date Start date
C

charles

We have 2 Excel 2007 spreadsheets of data.

List 1 contains 12,000 records and we wish to remove the contents of list 2
which contains 3,000 records.

Is there a simple way of doing this?
 
Not sure how it is duplicated - does your list of 3000 records contain
records which are duplicated (among themselves)? Or does the shorter
list contain records which are also in the longer list, and you want
to get rid of these duplicates?

You can put a formula in a helper column which looks to see if the
record on that row is contained within the longer list (using MATCH
with a column which has unique values - I can't give you a specific
formula because you have not described your data) and return something
like "remove". Then you can apply autofilter to that column and select
"remove", then highlight all the visible rows and Edit | Delete Row.
When you remove the autofilter you should be left with the non-
duplicate records.

Hope this helps.

Pete
 
Pete

Thanks for coming back to me.

Yes the latter, the list 2 contains 3000 records which we want deleted out
of the 12000 list 1. Does this make more sense?
 
Yes, my second paragraph describes how to do it. You would have
something like:

=IF(ISNA(MATCH(A1,Sheet1!A:A,0)),"not present","duplicate")

in a helper column in Sheet2, and you would copy this down your 3000
rows. This assumes that column A contains the unique identifier in
both lists, and the formula will return "not present" for a unique
record and "duplicate" for duplicated records. As advised earlier, you
can then use autofilter on this helper column to remove the
"duplicate" records.

Hope this helps.

Pete
 
Back
Top