Duplicates in excel that aren't 100% DUPLICATES ...

J

jeisdorfer

I'm having a merging nightmare in excel ...

I'm trying to get rid of duplicates of names and address from a list
of about 6,000 however, the names and addresses are not uniformly
formatted (i.e. some have Apt. # in column Address 1 whereas some are
in column Address 2, some have AVE vs. Avenue, some names are Mr. and
Mrs. LastName others are FirstName and FirstName LastName. etc. etc.
the mess goes on).

Is there a formula I could put in a column next to the address line 1
just the street number followed by the first word? i.e. "111
Westbourne" from "111 Westbourne Terr. #3" I think that would help me
sort through the mess.

Is this the easiest way to go? Baaah....
 
O

Otto Moehrbach

You might try Data - Text To Columns and specify a space as the delimiter.
Look this up in Help. This will parse an entry into several columns with
one word/number in each column. You could then concatenate any 2 or more
columns to get what you want. HTH Otto
 
J

Joel

The best way is to use two new auxilary columns. The two functions will look
for the first space and put the data to the left of the space (the street
number) in one cell and the string to the left of the space in a second
column.

Auxilary Address column
=LEFT(G17,FIND(" ",G17)-1)

the FIND(" ",G17) gets the position of the frist space. Subtracting 1 from
the space gets the length of the street number in characters. Then the LEFT
extracts the street number. G17 is the cell where the street name and number
are located.


street name column

=MID(G17,FIND(" ",G17)+1,LEN(G17))

Mid works the same as left put extracts the middle of a string. We use the
same FIND command but add one to get the start of the street name. Mid also
requies a number of characters which has to be as long as the street name but
can be longer. I just use the entire string length to make the code simple.
 
J

jeisdorfer

So that ALMOST works, except, for the street name column -- I only
want the name of the street because the end of the string varies (i.e.
AVE, Avenue, Apt #s, etc.). Is there a way to select the words IN
BETWEEN the first and second space? Hmm...

Then I'll add the Street #, Street Name, and ZIP columns together and
find duplicates from there ...
 
J

Joel

Of course, Of course.


Street_Num =LEFT(G17,FIND(" ",G17)-1)
First_Word =MID(G17,FIND(" ",G17)+1,LEN(G17))
Second_Word =MID(First_Word,FIND(" ",First_Word)+1,LEN(First_Word))
'remove the 2nd word from the 1st word
First_Word = LEFT(First_Word ,FIND(" ",First_Word )-1)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top