Duplicates in excel that aren't 100% DUPLICATES ...

  • Thread starter Thread starter jeisdorfer
  • Start date Start date
J

jeisdorfer

I'm having a merging nightmare in excel ...

I'm trying to get rid of duplicates of names and address from a list
of about 6,000 however, the names and addresses are not uniformly
formatted (i.e. some have Apt. # in column Address 1 whereas some are
in column Address 2, some have AVE vs. Avenue, some names are Mr. and
Mrs. LastName others are FirstName and FirstName LastName. etc. etc.
the mess goes on).

Is there a formula I could put in a column next to the address line 1
just the street number followed by the first word? i.e. "111
Westbourne" from "111 Westbourne Terr. #3" I think that would help me
sort through the mess.

Is this the easiest way to go? Baaah....
 
You might try Data - Text To Columns and specify a space as the delimiter.
Look this up in Help. This will parse an entry into several columns with
one word/number in each column. You could then concatenate any 2 or more
columns to get what you want. HTH Otto
 
The best way is to use two new auxilary columns. The two functions will look
for the first space and put the data to the left of the space (the street
number) in one cell and the string to the left of the space in a second
column.

Auxilary Address column
=LEFT(G17,FIND(" ",G17)-1)

the FIND(" ",G17) gets the position of the frist space. Subtracting 1 from
the space gets the length of the street number in characters. Then the LEFT
extracts the street number. G17 is the cell where the street name and number
are located.


street name column

=MID(G17,FIND(" ",G17)+1,LEN(G17))

Mid works the same as left put extracts the middle of a string. We use the
same FIND command but add one to get the start of the street name. Mid also
requies a number of characters which has to be as long as the street name but
can be longer. I just use the entire string length to make the code simple.
 
So that ALMOST works, except, for the street name column -- I only
want the name of the street because the end of the string varies (i.e.
AVE, Avenue, Apt #s, etc.). Is there a way to select the words IN
BETWEEN the first and second space? Hmm...

Then I'll add the Street #, Street Name, and ZIP columns together and
find duplicates from there ...
 
Of course, Of course.


Street_Num =LEFT(G17,FIND(" ",G17)-1)
First_Word =MID(G17,FIND(" ",G17)+1,LEN(G17))
Second_Word =MID(First_Word,FIND(" ",First_Word)+1,LEN(First_Word))
'remove the 2nd word from the 1st word
First_Word = LEFT(First_Word ,FIND(" ",First_Word )-1)
 
Back
Top