search free formated field

G

Guest

I have an employee name field with text data type, the employee name get
entered by all type of users so the field format is very different for
example:

John (blank) Smith
Smith, John
John (blank) / (blank) Smith
John/Smith/3669
..
...

I need to take this free format field and append it to an employee table
with the following fields:

lastName
firstName


Regrads
 
A

Allen Browne

Don't you hate it when the source data is so inconsistent? :)

The best you can do with this is to import it into a single field (say
FullName), and then use a series of Update queries to populate the actual
fields.

By examining the data, you have to make a series of assumptions. For
example: if the field contains a comma, that everything before that is the
surname. So, under the FullName field, you enter criteria:
Like "?*,*"
Then in a fresh column in the Field row:
Trim(Left([FullName], Instr([FullName], ",") - 1))
and in the next column in the Field row:
Trim(Mid([FullName], Instr([FullName], ",") + 1))
Check this gives you sensible results.
Then change the query to an Update query (Update on Query menu.)
Move the expressions from the Field row into the Update row under the
lastName and firstName fields respectively.

Now add criteria under lastName and FirstName of:
Is Null
so that subsequent operations don't overwrite these fields.

Then you're off trying to create expressions that solve the next bunch of
operations. InstrRev() is useful for finding the *last* slash or space in
the name. Right() and Len() will be useful. Most expressions need Trim() so
you don't get leading or trailng spaces.

You can also parse a particular word from the field using a custom function
like this:
ParseWord(): Parses the first, last, or n-th word/item from a field/list
at:
http://allenbrowne.com/func-10.html
 
J

John Nurick

If there are really only a few patterns
John (blank) Smith
Smith, John
John (blank) / (blank) Smith
John/Smith/3669
....

this isn't too difficult. It's a matter of working out the exact rules
that apply, and then writing code to apply them. E.g.

Delete any leading or trailing spaces
Delete any spaces adjacent to slashes or commas
If the string contains one or more slashes
everything up to the first slash is the firstname
everything from the first slash to the second, or to the
end of the string if there is only one slash, is
the lastname
Else if the string contains a comma
everything before the comma is the lastname
everything after the comma is the firstname
Else if the string contains exactly one space
everything before the space is the firstname
everything after the space is the lastname
...and so on.

The problems start if some users have typed
John / Smith
and others have typed
Smith / John

They get worse if there are names where the division between "firstname"
and "lastname" requires knowledge rather than application of a rule,
Jon Benet Ramsay
Mary Tyler Moore
Pierre Joseph Marie Teilhard de Chardin
Boutros Boutros Ghali
Emmanuel Le Roy Ladurie
J. Edgar Hoover
Laurens van der Post

or that don't fit the first/last pattern at all

Sukarno
Sitting Bull
Lord Saye and Sele

Handling these hard cases takes a *lot* of programming. Consider using
a commercial tool such as Splitter For Microsoft Access
http://www.infoplan.com.au/splitter/. This isn't perfect (nothing can
be) but at least someone else has done the hard work<g>.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top