Normalization Question

P

PA

I search the archives and couldnt come up with an answer to the following,

I am creating a new Database of business contacts for my company, with
perhaps thousands of records in multiple cities. In the "tbleName" is such
data as fname, lname, address1, address2, city, state, zip, etc. There will
be great repetition of city and state since, for example, I may have 7 or 8
zipcodes within one large city, such as Chicago. Would it a better design
feature, from the point of view of normaliztion, to have a zipcode table,
containing "zipcode(primary key), city, and state", and just create a field
in the tblename for zipcode(foreign key), and join the two table?

Thanks for the advice.

Paul
 
D

Douglas J. Steele

What are you going to do about those zip codes that cover more than one
town?
 
R

Rick Brandt

PA said:
I search the archives and couldnt come up with an answer to the
following,

I am creating a new Database of business contacts for my company, with
perhaps thousands of records in multiple cities. In the "tbleName"
is such data as fname, lname, address1, address2, city, state, zip,
etc. There will be great repetition of city and state since, for
example, I may have 7 or 8 zipcodes within one large city, such as
Chicago. Would it a better design feature, from the point of view of
normaliztion, to have a zipcode table, containing "zipcode(primary
key), city, and state", and just create a field in the tblename for
zipcode(foreign key), and join the two table?

Thanks for the advice.

Paul

I addition to Douglas' comment I always consider zip code data (as it relates to
city) to be time sensitive. New zip codes could be created that change the
picture later. While this doesn't happen with the frequency of new area codes
it does still occur. That being the case I would keep all the fields in the
contacts table.
 
P

PA

You both have good reasons for repeating the city and state repetitively and
that is what I had roughed out.
The thought of breaking it out occured to me because of some reading I did
years ago - I believe it was from an Element K monthly publication but not
sure if that was the source, where they gave an example of building a student
database at an imaginary school, and they did break out the zip code to
conform to the three normal forms and Boyce Codd.
 
K

Ken Sheridan

Zip codes do not fit well with other geographical units, so I would agree
with Rick and keep them in the Contacts table. To ensure data integrity, and
ensure that only valid city/zip code combinations can be entered in the table
you'd need a another table with columns ZipCode and CityID and an enforced
relationship between Contacts and this table on both the CityID and ZipCode
columns. I suspect most people would not go to these lengths, however.

In the UK its different as a post code here is very specific geographically
(mine refers to the 35 houses on one side of my street), so coupled with the
house name or number its possible to pin down an address. Consequently these
are all that is commonly stored in many address tables here. The Street,
City and County are determined by the post code so need not be stored in the
table, but are known via the referenced PostCodes table.

The Contacts table should not have a State column however as this is
functionally dependent on the non-key City column. Instead you should have a
Cities table with columns CityID, City and State; city names can be
duplicated so a unique CityID is necessary as the key. The Contacts table
would have a CityID foreign key column. The States table would have a State
column, and perhaps another column with the full name of each state the
abbreviated form being the key.

You'll find a demo of how to handle geographical hierarchies like this by
means of combo boxes on forms at:


http://community.netscape.com/n/pfx...yMessages&tsn=1&tid=23626&webtag=ws-msdevapps


The demo uses the local administrative areas here of parish, district and
county, but the principle is exactly the same.

One issue with this type of data is that of a missing link in the chain.
This arises for instance with international databases where some countries
have regional units, e.g. states in the USA, counties in the UK, départements
in France, but other small countries don't have a regional structure.
Consequently if you have Cities, Regions and Countries tables, if for cities
in those countries without a regional structure the Region was left Null,
then there would be no way of knowing which country the City is in. The
solution here is to have a row in Regions with values of a RegionID, 'N/A'
and the country name in its three columns. The RegionID foreign key column
in the Cities table would have the value for the N/A row for the country in
question.

It might also be asked why have a Countries table at all when the country
names are stored in the regions table? It’s a matter of data integrity; by
enforcing referential integrity between Countries and Regions this ensures
that only valid country names can be entered into the Regions table.
Otherwise you might have the same country entered differently, even if only
by a simple typo.

Ken Sheridan
Stafford, England
 
P

PA

Thanks for the details, I will digest your suggestions and try to implement.
I have downloaded the example you reference and will take a look, later this
afternoon (US EST).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top