Delete Rows based on number of instances

  • Thread starter Guy Brown via OfficeKB.com
  • Start date
G

Guy Brown via OfficeKB.com

I have a very large amount of data to wade through, and am only really
proficient at using Excel (not access).

I 3 columns of data, and the 3rd column can be repeated row by row. I want
to identify where there are more than 500 (or a definable number) rows with
the same part number, and delete the entire rows. All programs I have seen
so far delete duplictes, I need to maintain duplicates but delete mass
duplication only based on a defined number.

Data will look like this before:

A-2103-0121-0A Y A-2053-7531-01
A-2064-0034-0B Y A-2053-7531-01
A-2033-0702-03 Y M-2033-0973-01
A-2034-0004-03 Y M-2033-0973-01
A-2034-0005-03 Y M-2033-0973-01


If defined number of duplicates to keep was 2 or more say, then only top 2
rows remain. If defined number of duplictes was 3 or more say, then all
data would remain.

HELP?
 
T

Tom Ogilvy

assume you data starts in A2,
assume the number of rows to retain is place in F1

in D2 put in this formula

=if(countif(C:C,C2)>=$F$1,"Delete","Keep")
then drag fill down the column.

Now select all your data and do Data=>filter=>Autofilter

in the dropdown in column D, select Delete.

Now select all the data except the top row (assuming the top row is a header
row) and do
Edit=>Delete
select entire row.

Now do Data=>filter=>Autofilter to turn off the filter. Only the rows you
wanted to keep should remain. (deleting filtered data should only delete
the visible rows).

Now you can delete column D
 
B

Bob Phillips

Guy,

here is one way. Put this code in a standard code module, set the nThreshold
constant to the value you want, and then run it

Sub DeleteRows()
Const nThreshold As Long = 2
Dim cLastRow As Long

Range("D1").EntireRow.Insert
Range("D1").Value = "temp"
cLastRow = Cells(Rows.Count, "A").End(xlUp).Row
Range("D2").FormulaR1C1 = "=COUNTIF(C[-1],RC[-1])>" & nThreshold
Range("D2").AutoFill Destination:=Range("D2:D" & cLastRow)
Range("D1:D" & cLastRow).AutoFilter Field:=1, Criteria1:=True
Range("D1:D" &
cLastRow).SpecialCells(xlCellTypeVisible).EntireRow.Delete
End Sub

--

HTH

RP
(remove nothere from the email address if mailing direct)
 
G

Guy Brown via OfficeKB.com

Thanks for this suggestion - this would do it, and I have already tried it,
unfortunatley the calculating time for 60,000 lines causes excel to hang
up. I was hoping for a macro to get around the problem.
 
G

Guy Brown via OfficeKB.com

Thanks for the code Bob, does this refer to Column C - I cannot seem to get
it to work - do I need to select the range or something?
 
T

Tom Ogilvy

It just does with code what I described (enters the countif formula and
applies an autofilter). So if the manual method doesn't work, it would be
surprising if this worked.
 
G

Guy Brown via OfficeKB.com

I have found some code that does something similar - this was set up to
delete rows when there was less than 4 entries, I guess this should work if
I change the columns to C, and z>400?

Retain if atleast 4 entries 31 Jul 2003 21:25
Peter Atherton

Randy
The following works
Sub testDel()
Dim myRng As Range
Dim c, s, v
Dim count As Integer, nr As Integer, z As Integer
With Worksheets(1)
Set myRng = .Range("A2", .Cells(.Rows.count, "A").End
(xlUp))
End With
nr = myRng.Rows.count
For s = 2 To nr + 1
Cells(s, 1).Select
v = Cells(s, 1).Value
z = Application.CountIf(myRng, v)
If z < 4 Then
Selection.EntireRow.Delete
count = count + 1
End If
Next s
MsgBox count & " Records have been deleted", , "Deleted _
Record Count"
End Sub
 
B

Bob Phillips

This will be a lot (a very lot) slower than the autofilter method on 60,000
rows

--

HTH

RP
(remove nothere from the email address if mailing direct)
 
B

Bob Phillips

Yes, it does refer to column C. It creates a formula in column D and filters
on that and then deletes the visible cells.

--

HTH

RP
(remove nothere from the email address if mailing direct)
 
T

Tom Ogilvy

It is also flawed since it loops forward and doesn't take into account the
fact that rows have been deleted.

Of course Guy's example results are the opposite of his description, so who
knows.
 
B

Bob Phillips

:)



Tom Ogilvy said:
It is also flawed since it loops forward and doesn't take into account the
fact that rows have been deleted.

Of course Guy's example results are the opposite of his description, so who
knows.
 
G

Guy Brown via OfficeKB.com

Tom,

you are absolutely right. I have created my own script which does almost
what I want apart from the fact that the loop, as you say, does not take
into account the deleted rows?

"Private Sub CommandButton1_Click()
Const nThreshold As Long = 400
Dim cLastRow As Long

Dim myRng As Range
Dim C, s, v, vnext
Dim count As Integer, nr As Long, z As Integer, totalcount As Integer,
newcount As Integer
With Worksheets(1)
Set myRng = .Range("C1", .Cells(.Rows.count, "C").End(xlUp))
End With
nr = myRng.Rows.count
count = 0
For s = 1 To nr + 1
Cells(s, 3).Select
v = Cells(s, 3).Value
vnext = Cells(s + 1, 3).Value
If v = vnext Then
count = count + 1
ElseIf v <> vnext Then
If count > 20 Then
Range(Cells(s - count, 3), Cells(s, 3)).Select
Selection.EntireRow.Delete
totalcount = (count + 1) + oldcount
oldcount = totalcount
count = 0

Else
count = 0
End If
End If
Next s

MsgBox totalcount & " Records have been deleted", , "Deleted Record Count"

End Sub"

How can I do this??

otherwise I could "Clear" the rows and then delete all blank rows - any
idea how I can do this?
 
B

Bob Phillips

Guy,

We are confused. If Tom's method was too slow, then my macro, which was
essentially the same as Tom's in VBA form, would also be too slow, but this
method is magnitudes slower. Why do you want to pursue this method, why not
get our suggestion working?

--

HTH

RP
(remove nothere from the email address if mailing direct)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top