Compare City and State to Master List

G

Guest

Hello and thanks in advance!

In a single Excel sheet: I have a column (A) of city names and a column (B)
of state names (about 3100) generated from an internal company db. I need to
check for the "correct" spelling of these city and state names by comparing
each row in these columns to a master list of city and state names (about
29000) located in columns D and E, respectively, a list from a third party
resource.

SOOO, I believe I need a formula and/or logic to first see for each row if
the city and state in A and B match ANY city and state in columns D and E. If
so, no action should be taken or at most a Y indicator put in column C. If
not, then I would like the formula or logic to check the city and state name
from A and B to those found in D and E for the closest match and then suggest
that closest match in column C in that same row. It needs to take into
consideration two word city names (and spaces) and the state names need to
match.

Thanks!
 
G

Guest

Take a look at the VLOOKUP function. You will need to concatenate your
city/state into one helper column on one of the sheets for this to work.
To concatenate use =A1 & A2
 
G

Guest

Hi NE,

First, find the perfect matches:

1) In E2, and copy down for all rows of your company’s data:
=A2&B2
2) In F2, and copy down for all rows of “correct†data:
=C2&D2
3) In G2, and copy down for all rows of your company’s data:
=IF(ISERROR(VLOOKUP(E2,$F:$F,1,FALSE))=TRUE,“no matchâ€,â€matchâ€))

Second, to find the closest matches:

1) Select G1, left click on Data/Filter/Autofilter
2) From the down arrow in G1, select “no matchâ€
3) Alt-F11
4) Insert / Module
5) Paste the code from “Option Explicit†through “End Function†below into
the code window:

Option Explicit
Type RankInfo
Offset As Integer
Percentage As Single
End Type
Function FuzzyVLookup(ByVal LookupValue As String, _
ByVal TableArray As Range, _
ByVal IndexNum As Integer, _
Optional NFPercent As Single = 0.05, _
Optional Rank As Integer = 1, _
Optional Algorithm As Integer = 3, _
Optional AdditionalCols As Integer = 0) As Variant
Dim R As Range
Dim strListString As String
Dim strWork As String
Dim sngMinPercent As Single
Dim sngWork As Single
Dim sngCurPercent As Single
Dim intBestMatchPtr As Integer
Dim intRankPtr As Integer
Dim intRankPtr1 As Integer
Dim I As Integer
Dim lEndRow As Long
Dim udRankData() As RankInfo
Dim vCurValue As Variant
LookupValue = LCase$(Application.Trim(LookupValue))
If IsMissing(NFPercent) Then
sngMinPercent = 0.05
Else
If (NFPercent <= 0) Or (NFPercent > 1) Then
FuzzyVLookup = "*** 'NFPercent' must be a percentage > zero ***"
Exit Function
End If
sngMinPercent = NFPercent
End If

If Rank < 1 Then
FuzzyVLookup = "*** 'Rank' must be an integer > 0 ***"
Exit Function
End If

ReDim udRankData(1 To Rank)

lEndRow = TableArray.Rows.Count
If VarType(TableArray.Cells(lEndRow, 1).Value) = vbEmpty Then
lEndRow = TableArray.Cells(lEndRow, 1).End(xlUp).Row
End If

For Each R In Range(TableArray.Cells(1, 1), TableArray.Cells(lEndRow, 1))
vCurValue = ""
For I = 0 To AdditionalCols
vCurValue = vCurValue & R.Offset(0, I).Text
Next I
If VarType(vCurValue) = vbString Then
strListString = LCase$(Application.Trim(vCurValue))
sngCurPercent = FuzzyPercent(String1:=LookupValue, _
String2:=strListString, _
Algorithm:=Algorithm, _
Normalised:=True)

If sngCurPercent >= sngMinPercent Then
'---------------------------
'-- Store in ranked array --
'---------------------------
For intRankPtr = 1 To Rank
If sngCurPercent > udRankData(intRankPtr).Percentage Then
For intRankPtr1 = Rank To intRankPtr + 1 Step -1
With udRankData(intRankPtr1)
.Offset = udRankData(intRankPtr1 - 1).Offset
.Percentage = udRankData(intRankPtr1 -
1).Percentage
End With
Next intRankPtr1
With udRankData(intRankPtr)
.Offset = R.Row
.Percentage = sngCurPercent
End With
Exit For
End If
Next intRankPtr
End If

End If
Next R

If udRankData(Rank).Percentage < sngMinPercent Then
FuzzyVLookup = CVErr(xlErrNA)
Else
intBestMatchPtr = udRankData(Rank).Offset - TableArray.Cells(1, 1).Row +
1
If IndexNum > 0 Then
FuzzyVLookup = TableArray.Cells(intBestMatchPtr, IndexNum)
Else
FuzzyVLookup = intBestMatchPtr
End If
End If
End Function



6) In H1: 1
7) In I1: 2
8) In J1: 3
9) Into the first cell in column H below row 1 (still filtered on column G).
Use the current row number to replace the # symbol in the formula below:
=fuzzyvlookup($E#,$F:$F,1,,,H$1)

10) While still filtered on column G: copy the formula into columns I and J;
for all rows of your company’s data in columns H, I, and J where the city and
state are “no matchâ€

Notes:

The custom function cannot have empty cells in the middle of the lookup
data, an empty cell is the test that will stop the search process.

Fuzzy matching algorithms are very CPU hungry, so try to use them sparingly.

Let me know if this helps.

Thanks,
Peggy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top