How to Split Cell Content Into Parts

D

doyle60

I have a friend who built a wonderful database but mistakenly put too
much information in one field. For a table of People, he put into one
field the following:

George Mitchell (b. 1976), husband of Mrs. Mitchell (1978-1999)
Hank Natty Barnes (1899-1901), brother of so and so.
Nelly Margeret Stevens (1978-1999) actress
Telsea "Minnie" Mack-Evens (b. 1878)

It has about a thousand records with data similar to the above.

I want to know how to split the above, using a query, into five fields.
Here are the fields and the rules:

FirstName:
If there are three or more words before the first peren, put the first
two words, if two or one word, put the first word only. (Best to treat
hypenations as one word but this can be hand corrected later).

LastName:
Put the word before the first peren.

BirthYear:
Some cells have two sets of perens, so only for the first set if it
takes this form (b. 9999), put 9999, or if it takes this form
(9999-9999), then put 9999 (the birth year, of course).

DeathYear:
If the cell has (9999-9999) then give me 9999 (the last year). No
cells have (d. 9999).

Comment:
All the field's comment after the second peren, beginning with the
first letter.

Even after this, much hand correcting, though a thousand records will
be done.

Thanks for your help,

Matt
 
J

John Nurick

There are two ways of going about this, excluding the option of hiring a
fast typist.

One is to write a series of custom VBA functions, one for each field you
want to extract, that use the native VBA string-handling functions such
as InStr() and Mid() to find and extract the bits you want.

The other is to learn to use regular expressions, which provide a
powerful means of parsing data like this. Here's an example, a function
that I think will parse out the first or last name from your sample
fields:

Function ExtractField(ByVal S As String) As String

'Be sure to set a reference to
' Microsoft VBScript Regular Expressions 5.5
'or the function(s) won't compile

Dim R As RegExp
Dim MC As MatchCollection
Dim MA As Match

Set R = New RegExp
With R
.Pattern = "^([^)]+) ([^ (]+) \("
Set MC = .Execute(S)
ExtractField = MC(0).SubMatches(0)
'Submatches(0) returns FirstName
'Submatches(1) returns LastName
End With
Set MC = Nothing
Set R = Nothing
End Function


Whichever way you choose, there's a lot of hard thinking and careful
testing involved.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top