Deleting Duplicates, All records unique

  • Thread starter: mirdonamy

mirdonamy

My column headers are: id, filename, location, and description.

All descriptions are unique.
My filename column has duplicates. For example, flower010104.jpg is
listed twice, with two different descriptions. I want to delete BOTH
rows containing flower010104.jpg.

So, I want to delete ROWS with duplicate filenames, regardless of the
description being unique (which makes the 'record' unique).

I have found that I can only filter by 'unique record', but ALL records
are unique, due to the description.

I need help. How can I accomplish my task?
 
mirdonamy,

Use another column with a formula like this in row2:

=COUNTIF(B:B,B2)>1

where column B holds your filenames. Copy the formula down to match your data table, then filter or sort
on that column and delete the rows where the formula returns TRUE.
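
For anyone doing the same clean-up outside Excel, the counting idea behind the COUNTIF formula can be sketched in Python. This is only an illustration under assumptions: the sample rows, the function name drop_all_duplicated and the key_index parameter are made up for the example and are not part of the original workbook.

```python
from collections import Counter

# Hypothetical sample data: (id, filename, location, description)
rows = [
    (1, "flower010104.jpg", "garden", "red rose"),
    (2, "tree020204.jpg", "park", "old oak"),
    (3, "flower010104.jpg", "garden", "rose close-up"),
    (4, "lake030304.jpg", "lake", "sunrise"),
]


def drop_all_duplicated(rows, key_index=1):
    """Remove EVERY row whose filename occurs more than once."""
    # Same role as COUNTIF(B:B,B2): how many times each filename appears.
    counts = Counter(row[key_index] for row in rows)
    # Keep a row only when its filename is seen exactly once
    # (the rows where the Excel formula would return FALSE).
    return [row for row in rows if counts[row[key_index]] == 1]


kept = drop_all_duplicated(rows)
for row in kept:
    print(row)
```

Both flower010104.jpg rows are dropped here, not just the second one, which matches the original request.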

HTH,
Bernie
MS Excel MVP
 
That's a pretty impressive formula, but here's the odd thing... TRUE
only brought up 22 records (all duplicate filenames, just as I wanted).
However, it didn't bring up the other 700+ records that have duplicate
filenames. I can't quite understand why this happened.

Just a note: for each of these filenames, one row is filled in completely
(all the way across), while its duplicates have nothing filled in other
than the filename column. Does this affect the formula?
 
Here's another fairly quick way. I assume your data is not sorted by
filename and I presume you want to keep the sequence you have at the
moment. Assume your four fields occupy columns A to D, and that the
data starts in row 2 (after the headings) and goes down to row 5000.

Add the heading "seq" in column E and in E2 enter 1. Highlight cells E2
to E5000 then Edit | Fill | Series and check Linear with a step value
of 1. Click OK - this will fill a sequence down this column to enable
you to get the data back into the same order.

Highlight A1 to E5000 and sort the data using filename (column B). Add
the heading "Check" in column F, and in cell F2 enter the following
formula:

=IF(OR(B2=B1,B2=B3),"duplicate","unique")

Copy this down to F5000 (double-click the fill handle). Select Data |
Filter | Autofilter (on). Filter column F for "duplicate". Highlight
all visible rows between Row 1 and Row 5001, and Edit | Delete Row. Use
the filter pull-down on column F to select "All", then Data | Filter |
Autofilter (off).

Re-sort the remaining data using column E (seq) for sort order.
Finally, delete columns E and F.
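
The same sequence of steps (number the rows, sort by filename, flag any row that matches a neighbour, delete the flagged rows, restore the original order) can be sketched in Python as a rough equivalent. The sample rows and the names drop_duplicates_by_sorting and key_index are assumptions made up for this illustration.

```python
# Hypothetical sample data: (id, filename, location, description)
rows = [
    (1, "zebra.jpg", "zoo", "stripes"),
    (2, "flower010104.jpg", "garden", "red rose"),
    (3, "apple.jpg", "kitchen", "green apple"),
    (4, "flower010104.jpg", "garden", "rose close-up"),
]


def drop_duplicates_by_sorting(rows, key_index=1):
    # The "seq" column: remember each row's original position.
    tagged = list(enumerate(rows, start=1))
    # Sort by filename so that duplicates end up next to each other.
    tagged.sort(key=lambda item: item[1][key_index])

    survivors = []
    for i, (seq, row) in enumerate(tagged):
        key = row[key_index]
        # Same test as =IF(OR(B2=B1,B2=B3),"duplicate","unique").
        same_as_prev = i > 0 and tagged[i - 1][1][key_index] == key
        same_as_next = i + 1 < len(tagged) and tagged[i + 1][1][key_index] == key
        if not (same_as_prev or same_as_next):
            survivors.append((seq, row))

    # Re-sort on seq to get the original order back, then drop seq.
    survivors.sort(key=lambda item: item[0])
    return [row for _, row in survivors]


result = drop_duplicates_by_sorting(rows)
for row in result:
    print(row)
```

Because the check looks at both the previous and the next row after sorting, every copy of a repeated filename is flagged, including the first.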

Hope this helps.

Pete
 
Note, you will get some #REF! errors in column F after you have deleted
the rows, but this does not matter.

Pete
 
You are brilliant!!! Thank you so much! You saved my day and gave me
back hours of my life! Thank you thank you!

I am so appreciative!
Arielle
 
