parsing lengthy text fields

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I've got an interesting situation for parsing assistance:

I've got years of data on tree pathology that was entered manually way 'back
in the day' and now I want to parse them into their own fields based on the
year of observation. Entries might look something like (three examples):

1012-- broken top- large percent of crown gone(97); witches brooms
(97)-minor, upper branches(99); broken top-old, with new leaders(98)

1008-- dead top(97)(98)(99); needle damage-black mold on old
needles(97)-black specks(99); sparse foliage (97); reduced crown-1/2(98); dt;
sf- e2(00)

defoliation on top; white fluffy stuff w/abnormal growth in new needles;
scar-small, healing, on base(99)

where the memo field may or may not be iniated with the tree tag number (a
primary key), followed by pathology observations with the year of observation
in parentheses, which are usually followed by a semi-colon. The semi-colon
is expendable, and will need to be deleted.
I'm relatively new to queries, and know nothing of visual basic, though
would like to learn. Any help would be appreciated! Thanks!
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The first thing you need to do is design the table(s) that will hold the
results of your parsing. I'd suggest 1 data table and 1, or more,
"lookup tables" that hold the attributes of the observation. E.g.:

(Using JET SQL)

The lookup table:

CREATE TABLE ObservationTypes (
ObsID COUNTER NOT NULL UNIQUE,
ObsDescription TEXT(255) NOT NULL PRIMARY KEY
)
The Primary Key (PK) on the ObsDescription column prevents the
duplicates of the same description. ObsID as a unique column allows it
to be used as a Foreign Key (FK) by other tables.

The data table:

CREATE TABLE TreeObservations (
TreeTag INTEGER NOT NULL ,
ObsYear INTEGER NULL ,
ObsID LONG NOT NULL REFERENCES ObservationTypes
)

The best PK for the TreeObservations table would be the TreeTag, ObsYear
& ObsID. But, since the ObsYear can be NULL, that won't work. Next PK
could be TreeTag & ObsID, but that would mean there could be only one
type of ObsID for that TreeTag number. E.g.:

TreeTag ObsYear ObsID
25 NULL 5
25 1988 5 <- Not allowed, cuz 25,5 = 25,5 - duplicate

So, you may just have to fore-go a PK on that table. At least put
indexes on all the columns so you can have efficient searches.

I believe that if the observation doesn't have a Tree Tag number, the
observation is useless, 'cuz we don't know which tree this refers to or
where it is.

So, now you have to parse the data & fill the tables. Fill the
ObservationTypes table first. Use a Recordset & sequentially go thru
the original table parsing the observations out of the Memo column and
putting them in the TreeObservations table. Have an error trap that
ignores duplicates (error 3022). There are a bunch of parsing functions
on the internet, just search Google for "parsing VB."

Once the ObservationTypes table is filled, fill the TreeObservations
table using a recordset:

1) Get the Tree Tag Number, if no number skip this row.

2) For each observation in the row, write the TreeTag, ObsYear and ObsID
number into TreeObservations. To get the ObsID use something like this:

Dim strObs As String
Dim strSQL As String
Dim i As Integer

strObs = ParseString(rs(0), 1, ";")
i = 1
Do While Len(strObs) > 0

strSQL = "SELECT ObsID FROM ObservationTypes " & _
"WHERE ObsDescription = '" & strObs & "'"
i = i + 1
strObs = ParseString(rs(0), i, ";")

Loop

ParseString syntax:

ParseString(<search string>, <delimiter number>, <delimiter>)

<delimiter number> is the the 1st, 2nd, 3rd, etc. delimiter in the
search string.

3) Go to the next row & repeat 1 & 2 until no more rows in the
recordset.

HTH,
--
MGFoster:::mgf00 <at> earthlink <decimal-point> net
Oakland, CA (USA)

-----BEGIN PGP SIGNATURE-----
Version: PGP for Personal Privacy 5.0
Charset: noconv

iQA/AwUBQfF/5YechKqOuFEgEQIuEACeOrD1qvIcDjBc0h+vy8y7xWQVv4EAoKoE
gIZLEdhco1N2Fc+83CcdVxcO
=flwU
-----END PGP SIGNATURE-----
 
I can get this done for you in a short time. Contact me at my email address
below if you would like my help.
 
Back
Top