split string based on a SET of words

M

MHoenders

hi,

i have a string containing an address/town/zip code and want to split
them in seperate columns.
example:
123 ocean rd smalltown 2134
into:
123 ocean rd | smalltown | 2134

i don't know to do the first split as its position varies but the
street always ends with rd, st, ave. can that be of any help? can
someone help me?

regards
mordy
 
P

Peo Sjoblom

If there are only these three options and if they have a space before and
after

=LEFT(A1,FIND("^^^",SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(LOWER(A1)," rd
","^^^")," st ","^^^")," ave ","^^^"))+3)


will return the first string, then assuming that you put this formula in
let's say B1

=TRIM(SUBSTITUTE(A1,B1,""))
 
P

Peo Sjoblom

Oops, didn't see that you wanted to split it in 3 places

to get the last part use

MID(A1,FIND("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))))+1,1024)


Put that in D1 now for the second string in C1 with the first in B1 and the
last in D1 use

=TRIM(SUBSTITUTE(SUBSTITUTE(A1,B1,""),D1,""))

I haven't really tested it a lot but should work for most cases given only 3
options (what about blvd, Ln, Dr etc?),
 
R

Ron Rosenfeld

hi,

i have a string containing an address/town/zip code and want to split
them in seperate columns.
example:
123 ocean rd smalltown 2134
into:
123 ocean rd | smalltown | 2134

i don't know to do the first split as its position varies but the
street always ends with rd, st, ave. can that be of any help? can
someone help me?

regards
mordy

I would use UDF's involving regular expressions. Then you can add as many
street type abbreviations as you wish.

In the functions below, StrAbbrev is a named cell containing a pipe-delimited
list of your desired street name abbreviations. For example:

rd|st|av|ave|wy|dr|ln

Do NOT include the leading or trailing <space> as that is accounted for in the
basic function. Also accounted for, in the basic function, is an optional dot
(.).

Street Address:

=resub(A1,"(.*"&StrAbbrev&".?)\s.*","$1")

City:

=resub(A1,".*\s("&StrAbbrev&")\.?\s(.*)\s\d+-?\d+$","$2")

Zip:

=resub(A1,".*\s(\d+-?\d+$)","$1")

Since you showed a 4 digit zip code in your example, and since that is not
standard for the US, the zip code part of the regex may need to be tweaked. As
written, it consists of any series of digits that may or may not include a dash
(-), begins with a <space> and ends at the end of the line.

To use these UDF's you must enter them into a regular VBA module.

<alt-F11> opens the VB Editor.

Ensure your project is highlighted in the Project Explorer Window.

Insert/Module and paste the code below into the window that opens.

========================================
Option Explicit
Function RESub(str As String, SrchFor As String, ReplWith As String) As String
Dim objRegExp As RegExp

Set objRegExp = New RegExp
objRegExp.Pattern = SrchFor
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.MultiLine = True

RESub = objRegExp.Replace(str, ReplWith)

End Function
===============================================
--ron
 
P

Peo Sjoblom

I would definitely go with Ron's suggestion, Excel is not very good at
parsing regardless if you parsing names or addresses given all the options
 
R

Ron Rosenfeld

I would use UDF's involving regular expressions. Then you can add as many
street type abbreviations as you wish.

In the functions below, StrAbbrev is a named cell containing a pipe-delimited
list of your desired street name abbreviations. For example:

rd|st|av|ave|wy|dr|ln

Do NOT include the leading or trailing <space> as that is accounted for in the
basic function. Also accounted for, in the basic function, is an optional dot
(.).

Street Address:

=resub(A1,"(.*"&StrAbbrev&".?)\s.*","$1")

City:

=resub(A1,".*\s("&StrAbbrev&")\.?\s(.*)\s\d+-?\d+$","$2")

Zip:

=resub(A1,".*\s(\d+-?\d+$)","$1")

Since you showed a 4 digit zip code in your example, and since that is not
standard for the US, the zip code part of the regex may need to be tweaked. As
written, it consists of any series of digits that may or may not include a dash
(-), begins with a <space> and ends at the end of the line.

To use these UDF's you must enter them into a regular VBA module.

<alt-F11> opens the VB Editor.

Ensure your project is highlighted in the Project Explorer Window.

Insert/Module and paste the code below into the window that opens.

========================================
Option Explicit
Function RESub(str As String, SrchFor As String, ReplWith As String) As String
Dim objRegExp As RegExp

Set objRegExp = New RegExp
objRegExp.Pattern = SrchFor
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.MultiLine = True

RESub = objRegExp.Replace(str, ReplWith)

End Function
===============================================
--ron

I omitted one VERY important instruction in editing the above.

After entering the UDF into the module, select Tools/References and check the
entry for "Mocrosoft VBScript Regular Expressions 5.5"


--ron
 
M

maarten

thanks a lot guys. this works well for me. i added an if loop to test
the regex first so that it returns an empty field if no match is found
instead of the original string.

If objRegExp.Test(str) = True Then
RESub = objRegExp.Replace(str, ReplWith)
Else
RESub = ""
End If

mordy
 
R

Ron Rosenfeld

thanks a lot guys. this works well for me. i added an if loop to test
the regex first so that it returns an empty field if no match is found
instead of the original string.

If objRegExp.Test(str) = True Then
RESub = objRegExp.Replace(str, ReplWith)
Else
RESub = ""
End If

mordy

Glad it works for you. Thanks for the feedback.
--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top