Text Separation Function

K

kateconrey

I have a database with one column of information including names (First Last,
or First Middle Last) and for some names, locations in parenthesis. I am
looking for a function that will allow me to split these pieces into separate
columns for First Name, Middle Name (where applicable), Last Name, and
Location (where listed). Does this exist?
 
R

Ron Rosenfeld

I have a database with one column of information including names (First Last,
or First Middle Last) and for some names, locations in parenthesis. I am
looking for a function that will allow me to split these pieces into separate
columns for First Name, Middle Name (where applicable), Last Name, and
Location (where listed). Does this exist?

It may be doable, depending on the patterns. You can search the string for the
patterns which define the various substrings you wish to extract.

You'll have to give some examples that include all the variations of what you
want to do.
--ron
 
R

Ron Rosenfeld

I have a database with one column of information including names (First Last,
or First Middle Last) and for some names, locations in parenthesis. I am
looking for a function that will allow me to split these pieces into separate
columns for First Name, Middle Name (where applicable), Last Name, and
Location (where listed). Does this exist?

It may be doable, depending on the patterns. You can search the string for the
patterns which define the various substrings you wish to extract.

You'll have to give some examples that include all the variations of what you
want to do.
--ron
 
G

Greg Lovern

Hi Kate,

Assuming your data looks like this:

John Smith
John Eric Smith
John Smith (Seattle)
John Eric Smith (Boston)


I would set up some intermediary columns:
(I'm starting at row 2, assumeing headers in row 1)

Column B:
Word Count
=itCOUNTINCELL(A2," ")+1

Column C:
Has Location
=IF(RIGHT(A2,1)=")",TRUE,FALSE)

Column D:
Has Middle
=IF(OR(AND(B2=3,NOT(C2)),B2=4),TRUE,FALSE)

Column E:
Last Word
=IF(OR(AND(B2=2,NOT(C2)),AND(B2=3,C2)),2,3)

Column F:
Loc Word
=IF(C2,B2,FALSE)


Then use those to split out the First, Middle, Last, and Location:


Column H:
First Name
=itGETWORD(A2,1)

Column I:
Middle Name
=IF(D2,itGETWORD(A2,2),"")

Column J:
Last Name
=itGETWORD(A2,E2)

Column K:
Location
=itEXCLUDE(itGETWORD(A2,F2,,,," "),"(",")")


To get itCOUNTINCELL, itGETWORD, and itEXCLUDE, you'll need to
download and install the Free Edition of inspector text:
(it never expires)

http://precisioncalc.com/it


For more information on each of those three functions:

itCOUNTINCELL
http://precisioncalc.com/it/itCOUNTINCELL.html

itGETWORD
http://precisioncalc.com/it/itGETWORD.html

itEXCLUDE
http://precisioncalc.com/it/itEXCLUDE.html


Good luck with your project!


Greg Lovern
mailto:[email protected]
http://PrecisionCalc.com
More Power In Excel
 
G

Greg Lovern

Hi Kate,

Assuming your data looks like this:

John Smith
John Eric Smith
John Smith (Seattle)
John Eric Smith (Boston)


I would set up some intermediary columns:
(I'm starting at row 2, assumeing headers in row 1)

Column B:
Word Count
=itCOUNTINCELL(A2," ")+1

Column C:
Has Location
=IF(RIGHT(A2,1)=")",TRUE,FALSE)

Column D:
Has Middle
=IF(OR(AND(B2=3,NOT(C2)),B2=4),TRUE,FALSE)

Column E:
Last Word
=IF(OR(AND(B2=2,NOT(C2)),AND(B2=3,C2)),2,3)

Column F:
Loc Word
=IF(C2,B2,FALSE)


Then use those to split out the First, Middle, Last, and Location:


Column H:
First Name
=itGETWORD(A2,1)

Column I:
Middle Name
=IF(D2,itGETWORD(A2,2),"")

Column J:
Last Name
=itGETWORD(A2,E2)

Column K:
Location
=itEXCLUDE(itGETWORD(A2,F2,,,," "),"(",")")


To get itCOUNTINCELL, itGETWORD, and itEXCLUDE, you'll need to
download and install the Free Edition of inspector text:
(it never expires)

http://precisioncalc.com/it


For more information on each of those three functions:

itCOUNTINCELL
http://precisioncalc.com/it/itCOUNTINCELL.html

itGETWORD
http://precisioncalc.com/it/itGETWORD.html

itEXCLUDE
http://precisioncalc.com/it/itEXCLUDE.html


Good luck with your project!


Greg Lovern
mailto:[email protected]
http://PrecisionCalc.com
More Power In Excel
 
R

Ron Rosenfeld

It may be doable, depending on the patterns. You can search the string for the
patterns which define the various substrings you wish to extract.

You'll have to give some examples that include all the variations of what you
want to do.
--ron


Looking at Greg's examples:

First Last
First Middle Last
First Last (Location)
First Middle Last (Location)

IF that is correct, here is another approach with a User Defined Function.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter these formulas for the
particular portions, assuming that your data is in A2. They should return a
blank for Middle or Location if not present.

First Name: =RegexMid(A2,"(\w+)",1)
Middle Name: =RegexMid(A2,"\w+\s+(\w+)\s+(?!\()",1)
Last Name: =RegexMid(A2,"(\w+$|\w+(?=\s+\())",1)
Location: =RegexMid(A3,"\(([^)]+)",1)


=================================================
Option Explicit
Function RegexMid(str As String, sPattern As String, lSubMatch As Long) As
String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = sPattern
If re.test(str) = True Then
Set mc = re.Execute(str)
RegexMid = mc(0).submatches(lSubMatch - 1)
End If

End Function
=================================
--ron
 
R

Ron Rosenfeld

It may be doable, depending on the patterns. You can search the string for the
patterns which define the various substrings you wish to extract.

You'll have to give some examples that include all the variations of what you
want to do.
--ron


Looking at Greg's examples:

First Last
First Middle Last
First Last (Location)
First Middle Last (Location)

IF that is correct, here is another approach with a User Defined Function.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter these formulas for the
particular portions, assuming that your data is in A2. They should return a
blank for Middle or Location if not present.

First Name: =RegexMid(A2,"(\w+)",1)
Middle Name: =RegexMid(A2,"\w+\s+(\w+)\s+(?!\()",1)
Last Name: =RegexMid(A2,"(\w+$|\w+(?=\s+\())",1)
Location: =RegexMid(A3,"\(([^)]+)",1)


=================================================
Option Explicit
Function RegexMid(str As String, sPattern As String, lSubMatch As Long) As
String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = sPattern
If re.test(str) = True Then
Set mc = re.Execute(str)
RegexMid = mc(0).submatches(lSubMatch - 1)
End If

End Function
=================================
--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top