Code Speed Up

lists · Aug 9, 2004

I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

Can this be sped up?

I'm thinking that flagging all and then removing might speed things
up, but I'm not sure about this scenario.

Another chunk which actually appears to be quicker than the above,
which I modified to remove both duplicates and originals (originally
provided by Patrick Molloy) is:

Sub RemoveDupesAndOriginals()
Remove_Dupes 3
End Sub

Sub Remove_Dupes(testcol As Long)
Dim Col As Long
Dim lastrow As Long
Dim thisrow As Long
Dim lastrow2 As Long
Dim thisrow2 As Long

On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' get the last column, then add the row numbers
Col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row
lastrow2 = lastrow
' add a column fro the original row order
With Range(Cells(1, Col), Cells(lastrow, Col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, Col))
.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1
If Cells(thisrow, testcol).Value = Cells(thisrow - 1,
testcol).Value Then
Cells(thisrow - 1, testcol + 2).Value = 1
Cells(thisrow, testcol + 2).Value = 1
Rows(thisrow).Delete
End If
Next
'Delete the originals which had duplicates
For thisrow2 = lastrow2 To 2 Step -1
If Cells(thisrow2, testcol + 2).Value = 1 Then
Rows(thisrow2).Delete
End If
Next

If Cells(1, testcol + 2).Value = 1 Then
Rows(1).Delete
End If
'restore whats left to the original order
.Sort Cells(1, Col)

End With

EndMacro:

Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic

End Sub

Tom Ogilvy · Aug 9, 2004

assume this can be determined by looking at the values in column A

Sub DeleteDups()
Dim rng As Range
Columns(2).Insert
Set rng = Range(Cells(1, 1), _
Cells(Rows.Count, 1).End(xlUp))
rng.Offset(0, 1).Formula = _
"=if(countif($A$1:A1,A1)>1,na(),false)"
rng.Offset(0, 1).SpecialCells(xlFormulas, _
xlErrors).EntireRow.Delete
Columns(2).Delete
End Sub

Alan Beban · Aug 9, 2004

lists said:
I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

Can this be sped up?

Generally you can get a dramatic improvement by transferring the data
from the range(s) to array(s) and looping in the arrays, then returning
the data from the array(s) to the worksheet. This would require
mounting the problem of deleting rows of an array, or flagging the rows
targeted for deletion and deleting them after returning the array data
to the worksheet. This may seem a bit daunting, but the factor of
improvement in speed of execution by looping in the array(s) is often
several hundredfold.

Alan Beban

bruce · Aug 10, 2004

-----Original Message-----
I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

You could try this (uses a native Excel function for the
speed).

Sub unique_values()
'Creates a sorted list of unique values starting at Target
'Rev A 27/5/2003

'PRELIMINARIES
Dim Examine As String, Target As String, ThisPrompt As
String, title As String
Dim UserRng_A As Range, UserRng_B As Range
Dim valu As Variant

'STEP 1 DETERMINE WHERE THE RAW DATA IS
ThisPrompt = "Where is the top of the VALUES to test ? eg
A3 or B5"
title = "UNIQUE VALUES (Rev A)"
On Error Resume Next ' in case a range does not get
selected
'The use of the "Set" statement assigns the output to the
selected ActiveCell
Set UserRng_A = Application.InputBox(prompt:=ThisPrompt,
title:=title, _
Default:=ActiveCell.Address, Type:=8) '"Type 8" means a
Range result.
If UserRng_A Is Nothing Then 'input was box cancelled
MsgBox "Cancelled"
Exit Sub ' Rev A
End If

'STEP 2 DETERMINE WHERE TO PUT THE LIST
ThisPrompt = "Where is the Data to be put ?" _
& Chr(13) & Chr(13) & "You will need blank cells under
the it."
Set UserRng_B = Application.InputBox(prompt:=ThisPrompt,
title:="Select a cell", _
Default:=ActiveCell.Address, Type:=8)
If UserRng_B Is Nothing Then
MsgBox "Cancelled"
Exit Sub ' Rev A
End If
Target = UserRng_B.Address() 'the address of the selected
cell

'STEP 3 GATHER BASIC DATA
Application.ScreenUpdating = False
UserRng_A(0, 1).Select 'select the cell above
Examine = Selection.Address() 'the address of the cell
above
valu = Selection.Formula 'store the contents of the cell
one row above the first data
UserRng_A(0, 1).Formula = "temporary string" 'THE
ADVANCED FILTER DEMANDS A STRING IN THIS CELL

'STEP 4 CREATE THE UNIQUE ENTRIES
Range(Target).Clear 'needed to stop filtering falling over
Range(Examine).Activate 'filter then insert unique values
starting at Target
Range(Examine, ActiveCell.End(xlDown)).AdvancedFilter
Action:=xlFilterCopy, _
CopyToRange:=Range(Target), Unique:=True
'now sort the values
Range(Target).Select 'musn't remove this line
Range(Target, ActiveCell.End(xlDown)).Select
Selection.Sort Key1:=Range(Target), Order1:=xlAscending,
Header:=xlYes, _
OrderCustom:=1

'STEP 5 TIDY UP
UserRng_B.Formula = ""
Range(Examine).Formula = valu 'restore the original entry
to this cell
Application.ScreenUpdating = True

End Sub

Code Speed Up

lists

Tom Ogilvy

Alan Beban

bruce