Deleting Duplicate Rows

Jbm

Hi,
I searched the archives for close to an hour, but I couldn't figure out
how to adapt the code given there to suit my needs.
I have a large set of data, from about A1 to X441. Column X contains a
lot of exact duplicates, and I need to delete the rows where those duplicates
are (while keeping the first instance of each duplicate). For example:

John Smith Oxford St.
John Johnson Oxford St.
John Johnson Rubble St.
John Smith Oxford St.

All of those have things in common, but I only want to delete the final row
(and the whole row, not just the cell), because it is an exact duplicate.
How do I code for this? Excel 2007.
Thanks,
Jbm
 
Gord Dibben

Select column X.

Excel 2003: Data > Filter > Advanced Filter > Unique records only, with
"Copy to another location" selected.

Excel 2007: Data > Remove Duplicates. Click Unselect All, tick only
column X, then click OK.


Gord Dibben MS Excel MVP
 
Don Guillett

Another way, which does not copy elsewhere, IF sorting is allowed. It assumes
all the text for each entry is in ONE cell (is it?).
'==
Option Explicit
Sub SortAndDeleteDuplicatesSAS()
    Dim mc As Long
    Dim i As Long
    mc = 1 'column A
    ' Sort the single column so that equal values become adjacent.
    Columns(mc).Sort Key1:=Cells(1, mc), Order1:=xlAscending, _
        Header:=xlGuess, OrderCustom:=1, MatchCase:=False, _
        Orientation:=xlTopToBottom
    ' Walk upward and delete any row that repeats the row above it.
    For i = Cells(Rows.Count, mc).End(xlUp).Row To 2 Step -1
        If Cells(i - 1, mc) = Cells(i, mc) Then Rows(i).Delete
    Next i
End Sub
'====
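
If the rows of data have to stay together, one variation is to sort the whole block rather than a single column. This is only a minimal sketch: it assumes the data sits in A1:X441 of the active sheet with headers in row 1 and that duplicates are judged on column X. The range address and the macro name are illustrative, not from the thread. Sorting changes the row order, so the row that survives is the first one in the sorted order, and blank cells in column X count as equal to each other.

```vba
Option Explicit

Sub SortBlockAndDeleteDuplicatesInX()
    Dim rng As Range
    Dim i As Long
    Const KEY_COL As Long = 24               ' column X

    ' Assumed data block with headers in row 1; adjust as needed.
    Set rng = ActiveSheet.Range("A1:X441")

    ' Sort the whole block on column X so every row moves as a unit.
    rng.Sort Key1:=rng.Columns(KEY_COL), Order1:=xlAscending, Header:=xlYes

    ' Walk upward; delete a row when column X repeats the row above it.
    ' Stops at row 3 so the first data row (row 2) is never compared
    ' with the header.
    For i = rng.Rows.Count To 3 Step -1
        If Cells(i, KEY_COL).Value = Cells(i - 1, KEY_COL).Value Then
            Rows(i).Delete
        End If
    Next i
End Sub
```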
 
Steve

Make a backup first just in case the results are not what you expect.

In Excel 2007, select the whole sheet (Ctrl+A) and go to Data > Data Tools >
Remove Duplicates. In the dialog box, click Unselect All, then tick only
column X and click OK. This removes the whole row (A to X) wherever X is a
duplicate.

Regards
Steve
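
The same menu steps can also be run from a macro through the Range.RemoveDuplicates method (available from Excel 2007 on). A minimal sketch, assuming the data occupies A1:X441 on the active sheet with a header row; the range address and the macro name are assumptions made for illustration. Note that it removes the duplicate rows within that range only, so anything to the right of column X is left where it is.

```vba
Sub RemoveDuplicateRowsOnX()
    ' Assumed data block; adjust to the real extent of the data.
    ' Columns:=24 means the 24th column of the range, i.e. column X.
    ' Header:=xlYes tells Excel that row 1 holds headers.
    ActiveSheet.Range("A1:X441").RemoveDuplicates Columns:=24, Header:=xlYes
End Sub
```

As with the menu version, it is worth running this on a backup copy first.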
 
ryguy7272

Well, those are not all duplicates, so what is the logic?
John Smith Oxford St. = John Smith Oxford St.
However, John Smith Oxford St. <> John Johnson Oxford St.

Take a look at this:
http://www.rondebruin.nl/easyfilter.htm

Maybe you will have to run through the data a couple times, but that should
do what you want.
 
B Lynn B

This assumes Column A is continuously populated from top to bottom of data
set. If not, then pick another column to use for finding last row. You can
also use the UsedRange property of the sheet if necessary.

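Sub NoXDups()

    Dim TestR As Long
    Dim MyStr As String
    Dim Hit As Range

    ' Work upward from the last populated row in column A so that
    ' deleting a row never shifts the rows still to be tested.
    For TestR = Range("A1").End(xlDown).Row To 1 Step -1
        MyStr = Cells(TestR, "X").Value
        If Len(MyStr) > 0 Then
            ' After is the last cell of the column, so the search
            ' wraps around and starts at X1.
            Set Hit = Range("X:X").Find(What:=MyStr, _
                After:=Cells(Rows.Count, "X"), LookIn:=xlValues, _
                LookAt:=xlWhole)
            ' A match in an earlier row means this row is a duplicate.
            If Not Hit Is Nothing Then
                If Hit.Row <> TestR Then Rows(TestR).Delete Shift:=xlUp
            End If
        End If
    Next TestR

End Sub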
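
For comparison, here is a sketch of the same "keep the first instance, no sorting" idea without Find. It remembers each column X value in a Scripting.Dictionary and deletes every later row whose value has already been seen. It assumes Windows (for CreateObject), headers in row 1, and that case should be ignored when comparing; the macro and variable names are illustrative, not from the thread.

```vba
Sub DeleteLaterDuplicatesInX()
    Dim seen As Object          ' late-bound Scripting.Dictionary
    Dim doomed As Range         ' rows collected for deletion
    Dim cellText As String
    Dim lastRow As Long
    Dim r As Long

    Set seen = CreateObject("Scripting.Dictionary")
    seen.CompareMode = vbTextCompare     ' assumption: ignore case

    lastRow = Cells(Rows.Count, "X").End(xlUp).Row

    For r = 2 To lastRow                 ' row 1 is assumed to be headers
        cellText = CStr(Cells(r, "X").Value)
        If Len(cellText) > 0 Then
            If seen.Exists(cellText) Then
                ' Value already met higher up: mark the whole row.
                If doomed Is Nothing Then
                    Set doomed = Rows(r)
                Else
                    Set doomed = Union(doomed, Rows(r))
                End If
            Else
                seen.Add cellText, r     ' remember the first occurrence
            End If
        End If
    Next r

    ' Delete all marked rows in one go, so row numbers never shift mid-loop.
    If Not doomed Is Nothing Then doomed.Delete Shift:=xlUp
End Sub
```

Collecting the rows first and deleting once at the end is a design choice that lets the loop run top to bottom, which makes "first instance" unambiguous.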
 
Jbm

I tried your macro, but it doesn't seem to be working... Maybe the fact that
there are headers is screwing it up? I've been working with your code since
you posted it, but I can't get it to work correctly (Column A has data in
every cell until the bottom of my data set).
 
Jbm

Ryguy -- I can't install new software on this machine.

Gord -- it's telling me that it removed 11 duplicates, and 400-some unique
values remain. Despite this, all the duplicates I can see are still there
(and I am carefully checking to make sure they are exactly the same... They
are).
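
One possible explanation, which is an assumption and not something confirmed in this thread, is that values which look identical differ by invisible characters such as trailing spaces or non-breaking spaces (character code 160). A minimal sketch that cleans the text in column X before the duplicate removal is tried again, assuming headers in row 1; the macro name is illustrative. Run it on a backup copy, since it overwrites cell contents, and text that looks like a number may be converted to a number unless the cells are formatted as Text.

```vba
Sub CleanColumnX()
    Dim c As Range
    Dim lastRow As Long

    lastRow = Cells(Rows.Count, "X").End(xlUp).Row
    If lastRow < 2 Then Exit Sub         ' nothing below the header row

    For Each c In Range("X2:X" & lastRow)
        ' Only touch plain text constants, not formulas, numbers or errors.
        If Not c.HasFormula And VarType(c.Value) = vbString Then
            ' Turn non-breaking spaces into ordinary spaces, then strip
            ' leading and trailing spaces.
            c.Value = Trim$(Replace(c.Value, Chr(160), " "))
        End If
    Next c
End Sub
```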
 
Jbm

Well, this is deleting things, but not necessarily duplicates, and oftentimes
cells instead of rows (which means correlated data is getting thrown off).
Not all the data is in one cell. Sorting would be allowed as long as the rows
of data each stay together.
 
Don Guillett

If desired, send your file to my address below. I will only look if:
1. You send a copy of this message on an inserted sheet
2. You give me the newsgroup and the subject line
3. You send a clear explanation of what you want
4. You send before/after examples and expected results.
 
B Lynn B

The headers would not prevent it working unless you had a blank row between
the headers and the data, or no header in column A. Before posting, I tested
this (Excel 2007 Pro) on a block of data with duplicates in column X and it
worked just fine. And it accounts for all the conditions that appeared to
need satisfying. From your sample data, it seems the duplicates may not be
listed consecutively, and you specified the first instance should be the one
left, so re-sorting the data is likely to create problems.

If you've copied this section into a procedure that does other things as
well, perhaps you could post the whole thing to see if there is some other
factor causing it to fail. You don't happen to have any protection on the
sheet do you?
 
steve

If your data has headers, then instead of selecting column X in the Remove
Duplicates dialog box, make sure that the header of that column is the only
one that is selected.

Regards
Steve
 
