Delete rows with duplicate information

  • Thread starter Thread starter Ixtreme
  • Start date Start date
I

Ixtreme

I have a sheet on which I import data from an external source. After
importing data, I need something to remove complete rows that match a
specific criteria.

For example:

I have in Column D a Date and in column E an EmployeeName and in
column F and OrderNumber.

after my import, I need some vba that runs through my data quickly to
remove all existing rows that have the same Date, the same
EmployeeName and an empty OrderNumber.

For example
D E F
row 50) 17-08-2007 Mark 12345
row 60) 17-08-2007 Mark
row 62) 17-08-2007 John

In this case, only row 60 should be removed

Thank you very much
 
Sub Removed_Duplicates()

LastRow = Cells(Rows.Count, "D").End(xlUp).Row

Remove = False
LoopCounter = 1
Do While LoopCounter <= LastRow
If IsEmpty(Cells(LoopCounter, "F")) Then

MyDate = Cells(LoopCounter, "D").Value
Employee = Cells(LoopCounter, "E").Value

For RowCount = 1 To LastRow
If RowCount <> LoopCounter Then

If (Cells(RowCount, "D").Value = MyDate) And _
(Cells(RowCount, "E").Value = Employee) Then

Remove = True
End If

End If

Next RowCount

End If

If Remove = True Then
Rows(LoopCounter).Delete
Remove = False
Else
LoopCounter = LoopCounter + 1
End If
Loop
End Sub
 
Thanks Joel,

the code wors fine, however it takes more than 5 minutes to complete
this code on a 4,000 rows sheet. Is there a possibility to speed
things up?

Is it possible to do only the deleting in the selected rows for
instance? That would already help me a lot.
 
I made a minor correction that will probably speed up the time by 1/2. I
think this is the best that I can do. I added an Exit for statement that
will help.


Sub Removed_Duplicates()

LastRow = Cells(Rows.Count, "D").End(xlUp).Row

Remove = False
LoopCounter = inputbox("Enter Row to Start Deleteing : ")
Do While LoopCounter <= LastRow
If IsEmpty(Cells(LoopCounter, "F")) Then

MyDate = Cells(LoopCounter, "D").Value
Employee = Cells(LoopCounter, "E").Value

For RowCount = 1 To LastRow
If RowCount <> LoopCounter Then

If (Cells(RowCount, "D").Value = MyDate) And _
(Cells(RowCount, "E").Value = Employee) Then

Remove = True
exit for
End If

End If

Next RowCount

End If

If Remove = True Then
Rows(LoopCounter).Delete
Remove = False
Else
LoopCounter = LoopCounter + 1
End If
Loop
End Sub
 
Joel,

Thanks again, but it is still not fast enough. Would it be possible to
do the deleteing only on the selected rows that are displayed after
using criteria in an autofilter?

I have an autofilter with column B displaying the specific weeknumber
of the date in column D. If the user selects for example only to
display week 33, would it be possible to run the Macro
Removed_Duplicates only for the selected rows of week 33?

Thanks for your help!

Mark
 
I'm am curious if you ran the macro a 2nd time after the duplicates are
removed how fast does the macro run. The lenght of time the program takes to
run may just be a problem the first time you run the code. Onces most of the
duplicates are run the code should run quicker.

I have taken a Graduate Computer Science course on algorithms whre a major
portion of the course dealt with calculating speed of different algorithms.
I also know different methods of programming in VBA and which methods execute
faster than others.

I want to help solve your problem. The real problem with this code is doing
the delete. The faster methods won't work because the delete screws up the
way excel counts row numbers. Rows get skipped if the faster method is used.

I can change the code to check only certain row numbers. Pretty easy. But
it may result that duplicates in other rows may not get deleted.

If you only add new rows of data to previous list, I think the best method
is to leave the code alone. The code will take a long time to run the first
time you use it. But after most of the rows with no order numbers are
removed, the code speeds up.

The code checks only the rows with no order numbers. If you have 1000 rows
with now orde numbers the code may take 5 minutes to run. After the
duplicates are removed you may have only 20 rows with no order numbers. The
code should then run 1000/20 = 50 time faster. Five minutes = 600 seconds
/50 = 12 seconds. 12 seconds seems a reasonable amount of time to make sure
all your duplicates are removed.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top