Linking Records between two tables using queries

G

Guest

First of all I realize that this is an extra long post and I'm guessing a
challenging one, too. A gracious thank you in advance for taking the time to
read, ponder and possibly suggest a solution!

I’m using MS Access 2002 and am trying to create a link between
corresponding records in two different tables. We store our data in a mysql
database and use Access to work with the data. We currently have four tables
with a total of about 3.5 million records containing around 60 fields per
record. Three tables are very similar, each containing property information
collected by the tax appraiser for their respective counties. The
‘corporation’ table is different than the tax appraiser tables and is much
larger with about 3 million records. We need to create a link between about
5,000 records we have identified (using queries) in the three tax appraiser
tables and their corresponding records in the corporation table. I’m
guessing that the best way to do this may be to append the unique ‘corporate
number’ from the corporate table to the corresponding record in the tax
appraiser table. The unique corporation number would then become our common
element between the two records with which to create the link.

Two things are making this extra challenging for me: (1) the corporate name
in the tax table (the search criteria for searching the corporation table)
may not match-character-for-character the corporate name in the corporate
table due to abbreviations, misspellings, etc… and (2) using the tax
appraiser value for ‘corporation name’ as criteria to search the corporation
table may produce multiple matches since different corporations can have the
exact same name (but have unique corporate numbers). In these cases, clues
from the address fields often lead us to the correct match.

My best guess of a possible approach to automating the process of appending
the corporate number from the corporate table to the corresponding
corporation in the tax appraiser table:
1. Search for exact matches of corporate name and street address (or maybe
just street number and zip code since abbreviations are often used for
‘road’, ‘street’, etc…) using the tax appraiser table’s values to search the
corporate table. For these ‘exact’ matches the computer could automatically
append the corporate number to the corresponding record in the tax table
without user verification, except in cases where there are multiple matches.
For cases where there are multiple matches, the user could select the correct
record from a list of matches.
2. Search for matches of street number and zip from table to table to catch
some of the matches that were missed in step one. We could follow the same
process as in step one for multiple matches.
3. To automate matching records that steps (1) and (2) missed we could
search the first few characters of the corporation and matches for either
city, zip, street number, etc… We could follow the same process as in step
one for multiple matches.
5. And so on, until all corporations are linked. The ones that we can’t
match through this process will have to be done manually. For some of the
records, we won’t be able to identify the correct match. In these cases
we’ll just have to use the name and address supplied by the tax appraiser for
our mailings.

Thank you so very much for reading and for any possible help! I eagerly
anticipate your response!
 
T

Tom Ellison

Dear Joel:

As a starting concept, what you propose is a pretty close and not unlikely
"theory".

In my experience, a great deal can be learned as you begin to perform this.
Various clues you may not have originally perceived may come up as the work
progresses.

Given the large amount of data you have, the manual job must be kept to an
absolute minimum. Even a quick once-over matching these is going to be very
time consuming. If someone spends just 10 seconds each looking at 3.5
million rows, thats 10,000 man hours (5 man years!) Likely this just isn't
feasible.

On the other hand, rekeying all this is a far greater problem, especially
with 60 columns of information at stake.

I am wondering also about your choice of Access for the job. 3.5 million
rows at 60 columns is likely to be a database very near the 2 gigabyte limit
for Jet or MSDE when you get it all into Access. If you really want to do
this, I recommend at least using the new SQL Express with a 4 gigabyte limit
and some very good capabilities for handling such large amounts of data.
Indeed, if you should somehow outgrow this, you could then readily upsize to
SQL 2005 and not have to worry about size any more, and you will have lost
none of the work along the way (the migration upward is extremely painless,
as these are versions of the same product).

Even this is just a guess. The first step really needs to be a careful
sizing of the product. No sense getting an investment in something that
can't be used to complete the job.

I recommend you really go for an experienced professional to handle this
job. You'll spend a bit of money, but you'll likely get a good job if you
find a good hand.

Tom Ellison
 
G

Guest

Dear Tom,

Thank you for your quick response to my post last evening. We currently do
store our database in MySql and view the data through MS Access using ODBC.
You suggested hiring someone for the job. Do you have any suggestions?
Thanks again for your response and I look forward to hearing from you.

Sincerely,
Joel
 
T

Tom Ellison

Dear Joel:

Advertising, soliciting, or making recommendations like you request is not
appropriate in the newsgroups. You may contact me by email. I use my
actual email address in the newsgroup. It is (e-mail address removed).

Tom Ellison
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top