Identifying complex duplicates




I have a spreadsheet with > 40,000 rows. An abbreviated example:
Date RecNum Code Amt Secondary
1/1/2005 453654 44324 550.54 no
12/24/2004 546676 54161 65.66 no
12/24/2004 546676 44970 900.44 no
12/24/2004 546676 43830 no
7/2/2003 54161 54640 773.21 no

The problem:
I need to identify dates where the RecNum occurs more than once. All
but one occurrence on the same date should have the secondary column as
'yes'. (In other words, if one RecNum (=person) has several codes on
the same day, one of the codes is primary, and all the others are
secondary. Ideally, I would look at amount to decide which is the
primary code and which are the secondary (biggest amount = primary).
Sometimes, no amount is given (ie, the data is crappy).

I assume I need to create a new column of date + RecNum (?), then check
for duplicates, and use an if..then conditional to compare amounts...

Any guidance would be appreciated....



Without supplying an exact solution, consider the following approach a
a set of macros:

1) Sort entire table as descending amount
2) turn on autofilters for all columns
3) start with the first data row, moving down towards the last row o
each iteration
4) extract the date value on the current row.
5) turn on a filter for that date only
6) set the first result row as yes, the others as no
7) undo the autofilter selection by date
8) increment your selection row, go back to 4)

Of course, add in needed error checking etc to deal with faulty data



This assumes data (columns A to E) is sorted by DATE (Ascending),
RECNUM(Ascending) and AMT (Descending). I have done limited testing!


Sub FindDuplicates()

Dim DateRng As Range
Dim lastrow As Long, r As Long
Dim nrec As Long, nrecnum As Long, ndate As Long

lastrow = Cells(Rows.Count, 1).End(xlUp).Row
Set DateRng = Range("a2:a" & lastrow)

r = 2
Compdate = Range("a" & r).Value
ndate = Application.CountIf(DateRng, Compdate)
If ndate > 1 Then
nrec = 0
nrecnum = Application.CountIf(Range(Cells(r, 2), Cells(ndate + r - 1,
2)), Cells(r, 2))
Cells(r, 5) = "Yes"
If nrecnum > 1 Then
Cells(r + 1, 5).Resize(nrecnum - 1, 1) = "No"
End If
r = r + nrecnum
nrec = nrec + nrecnum
Loop While nrec < ndate
Cells(r, 5) = "Yes"
r = r + 1
End If

Compdate = Range("a" & r).Value

Loop While r <= lastrow
End Sub

Bob Phillips

Does this do what you want?

In row 2


and copy down



(remove nothere from the email address if mailing direct)

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question