Parsing in Access

Z

Zardoz

Damit! I try this update in a query and it randomly cuts off text!!

UPDATE Pathfindermerge SET Pathfindermerge.Street =
trim(InStr([Address]," ")-1);

sample field looks like this:

MARLEEN HOUSTON

what happens is for example in the case of the city "Houston", I wind
up with something like "ouston" or if its "Pasadena", I wind up with
"adena". The program seems to be electively discounting the consonants.


What I really need is something to parse this field:

1218 FRESA PASADENA, TX 775020000
442 COUNTY ROAD 470 Apt# ETOILE, TX 75944-6408
5128 PINE AVENUE PINE TREE ASSISTED LIVIN PASADENA, TX 775030000
^note the spaces that occur ^I guess
the four spaces denoted a field

and so on... into seperate building number, apartment number, street
info, City, State, Zip and Plus 4 fields. What commands would I use to
identify a character sequence, such TX or CA or Apt and begin capturing
that string until it hits the next occuring "SPACE"? Somwhere, someone
knows how to do it, maybe a god or alien being or bionic man but
certainly not me!!!
 
D

Douglas J. Steele

The Trim function removes preceeding or trailing spaces. That's it.

However, I'm surprised you're not getting an error with that SQL statement,
as InStr([Address], " ") is going to give you a number (the position of the
first space in Address, or 0 if there are no spaces in it), and Trim
shouldn't work on a numeric field.

You've definitely got a non-trivial problem. If there are always 4 spaces
before the city, state and zip, you should be able to get that without too
much problem (Mid([Address], InStr([Address], " ") + 4) should give you
the city, state and zip code. Splitting that at the comma should give you
city and state, and if you know it's always a 2 character state
abbreviation, that should be easy to get out). The rest, though, will be
tricky.

To loop through a string, look at each character in turn, you'd use
something like:

Dim intLoop As Integer
Dim strChar As String

For intLoop = 1 To Len(strAddress)
strChar = Mid(strAddress, intLoop, 1)
' strChar will now be a character from the address

Next intLoop
 
J

John Nurick

As Doug says, it's a non-trivial problem. It is quite likely that there
will be ambiguities in the data that make it impossible to achieve a
100% solution.

But you can get a long way by using regular expressions. For instance,
this pattern (expanded by adding more states if needed)
"^.+\s{2,4}(\w+.*?),?\s+(AZ|LA|NM|TX)\s+(\d+-?\d*)\s*$"
will extract the city, state and zip code from your samples.

There's a sample function here which uses a regular expression to parse
many US "street" addresses (in your data, the part before the city
name), though it will need modification to handle your samples):
http://www.j.nurick.dial.pipex.com/Code/vbRegex/ParseAddress36.htm

Links on that page take you to more functions and information about
using regular expressions to parse text data.

Damit! I try this update in a query and it randomly cuts off text!!

UPDATE Pathfindermerge SET Pathfindermerge.Street =
trim(InStr([Address]," ")-1);

sample field looks like this:

MARLEEN HOUSTON

what happens is for example in the case of the city "Houston", I wind
up with something like "ouston" or if its "Pasadena", I wind up with
"adena". The program seems to be electively discounting the consonants.


What I really need is something to parse this field:

1218 FRESA PASADENA, TX 775020000
442 COUNTY ROAD 470 Apt# ETOILE, TX 75944-6408
5128 PINE AVENUE PINE TREE ASSISTED LIVIN PASADENA, TX 775030000
^note the spaces that occur ^I guess
the four spaces denoted a field

and so on... into seperate building number, apartment number, street
info, City, State, Zip and Plus 4 fields. What commands would I use to
identify a character sequence, such TX or CA or Apt and begin capturing
that string until it hits the next occuring "SPACE"? Somwhere, someone
knows how to do it, maybe a god or alien being or bionic man but
certainly not me!!!
 
Top