Record Comparison Between Two Databases

S

Steve C

I am writing a procedure to compare thousands of contact records from two
databases for any differences that exist between duplicate contacts in 17
different fields (Company, Address, phone number, email, etc.). Since both
databases are updated frequently and independently (that's unavoidable), we
need to compare the two on a monthly basis so that both contain the most
updated information as possible.

My approach has been to export all the contacts into one spreadsheet and
sort them by DisplayName (which includes first and last name). My procedure
first compares a DisplayName to the record below it. If it is listed only
once, obviously no duplicate exists and that record's entire row is removed
and pasted elsewhere. If the DisplayName is listed two or more times, the
fields of the first record are used as the basis of comparision to determine
if the other records are different. Differences are highlighted in yellow.

My question has to do with finding the most efficient way (via VBA) to
compare duplicate records. As an example, let's say that Joe Smith is listed
four times. My code currently concatenates all 17 fields for each record
into one string that can be quickly compared to each other. If the strings
for each are identical, there are no differences and those records are
removed. However, if the strings for two or more records are not the same,
then my code starts with the first of the 17 fields (Company) and compares
all the values for each Joe Smith record, highlighting differences that exist
from the first record. It repeats this process for all the fields and then
continues on for all other contacts.

If anyone has a better, more efficient suggestion for improving the
comparison process, I would appreciate hearing from you! Thanks.
 
J

Joel

My first recomendation no matter which methoid is used is to mark the rows to
delete and then delete all the rows at one time. I usually put an Z in the
auxilary column When I find a duplicate. Then sort descending order the
auxilary column with an X that will bring all the X's to the top of the
worksheet.

I like finding duplicates by using a worksheet formula that can put
automatically placed on the worksheet

Put in auxilary column and the copy down all the cells.
=if(COUNTIF(A$1:A1,A1)>1,"X","")

Try this code. I had the first row of the worksheet the names without a
header row. The names were in column A and the auxilary row is incolumn D.
The code may seem a little complicated but runs very quickly.


Sub RemoveDuplicates()
'Use column D as Auxilary column to place X on duplicate rows
With Sheets("Sheet1")
LastRow = .Range("A" & Rows.Count).End(xlUp).Row
'put formula in auxilary row D
.Range("D1").Formula = "=if(COUNTIF(A$1:A1,A1)>1,""X"","""")"
.Range("D1").Copy Destination:=.Range("D1:D" & LastRow)

'replace formula with value
Range("A1:D" & LastRow).Copy
Range("A1:D" & LastRow).PasteSpecial xlPasteValues


Range("A1:D" & LastRow).Sort _
Key1:=Range("D1"), _
Order1:=xlDescending, _
Header:=xlGuess

CountX = WorksheetFunction.CountIf( _
.Range("C1:D" & LastRow), "X")
.Rows("1:" & CountX).Delete
End With
End Sub
 
S

Steve C

Thanks, Joel. Your suggestion to process as many records at one time is a
good one. Thanks for the formula and the code.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top