3NF

G

Guest

In our Database Management class, our next project deliverable is to
normalize our database to 3NF. The argument has come up that in our database,
any entity that has City, Zipcode and State has a transitive dependency
between the three. One of our team mates says to resolve this issue we should
put all three attributes into one table with CityID (An autonumber) as the
foreign key to any other tables. His reasoning that if you know the City, you
then know the Zip code and State does not work because some of the Cities
have multiple Zip codes.

This would be the relationship schema he wants to use.

tblVendor (VendorCode, Season, BusName, RepName, Address, CityID,
PhoneNumber, AltPhoneNumber, Email)

One issue is that in our database, which is used by a local business who
only deals with people and business in about a 50 mile radius, is that since
there are a limited number of Cities, Zip codes and for the most part the
default value for state is NY, that they are in separate tables for City, Zip
code and State and these are basically used to fill combo boxes on a form for
data entry.

This would be the relationship schema I want to use.

tblVendor (VendorCode, Season, BusName, RepName, Address, CityID, ZipcodeID,
StateID, PhoneNumber, AltPhoneNumber, Email)

The real issue is that we only have to worry if they are transitive
dependencies, the Professor told us to work this out by researching the idea,
and one way to research this would be to ask the experts for their advice. So
any input to us would be greatly appreciated.

Thanks
CoachBarkerOJPW
 
J

Jason Lepack

This would be the relationship schema I want to use.
tblVendor (VendorCode, Season, BusName, RepName, Address, CityID, ZipcodeID,
StateID, PhoneNumber, AltPhoneNumber, Email)

The problem with the above is this example:
tbl_city
CityID, CityName
1, Ottawa
2, Vancouver

tbl_state (Province :D)
StateID, StateName
1, Ontario
2, British Columbia

Now, Ottawa is in Ontario and Vancouver is in British Columbia. The
problem with your schema is that this can occur:

tbl_vendor
city_id, state_id
1,2
2,1

Now I've just created a vendor in Ottawa, BC and Vancouver, Ontario.

I would suggest that you avoid that because your model won't match
reality.

Cheers,
Jason Lepack
 
R

Roger Carlson

I see a several problems.

First of all, your design assumes that a vendor can have only one address
and one Rep. If that's in the business rules (which we are not privy to),
that's fine, but there are plenty of business out there that have muliple
addresses and Reps. I suggest you create a separate Address table and also
a separate Rep table.

Now, each Address can be in only one City, but each City can certainly
contain more than one Address, so there is a one-to-many relationship there
as well. So the idea of a separate City table is correct, but it is related
to the Address table rather than the Vendor table.

Each City can be in only one State, but each State can contain more than one
City. So we also have a 1:M relationship there too. Thus you need a State
table related to the City table and not directly to the Vendor (or Address)
table.

Each City can have more than one Zip code AND each Zip code can have more
than one City. This is a Many-to-Many relationship. (frex: New York City
contains muliple zipcodes, however, the zip code that I live in spans
multiple cities but not *all* of each of those cities). Therefore, you need
to create a ZipCode table and create an intersection table between City and
ZipCode which contains the primary keys of the other two tables as foreign
keys.

Lastly, I am concerned about PhoneNumber and AltPhoneNumber and Email. What
if the vendor has more than 2 contact numbers, say a cell number? Or if the
vendor has several email addresses? One way to handle this is to have a
"Contact" table that would store all of these. Something like this:
Contact(ContactID, ContactType, ContactString). In ContactType, you would
store the type of contact (Phone, AltPhone, Cell, Email, etc.) and in
Contact String, you would store the actual value. Of course, you would want
to have a look-up table for each ContactType also (to reduce data integrity
errors) so you'd have another table for ContactType related to Contact.

Because this is an assignment, I'm not going to give you the actual design
details, but you should be able to piece them from here.

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
J

Jamie Collins

In our Database Management class, our next project deliverable is to
normalize our database to 3NF. The argument has come up that in our database,
any entity that has City, Zipcode and State has a transitive dependency
between the three.

Have you done the "Is an address an entity?" debate yet? It's a goodie
<g>.

Consider this approach: an address is an attribute of an entity not an
entity in itself; the meaning of 'address' in this context being, "The
text I have to put on the address label of an item of mail for the
post office to deliver it to the 'mailable' entity". Then you start to
think about what the post office expects e.g. USPS expects five lines
of CHAR(35).

If your business has a need to identify entities sharing an address it
may be better to model addresses as entities. If your business has a
need to identify entities with an address in the same administrative
area/city/county/electoral ward/etc it may be better to have an
attribute for administrative area/city/county/electoral ward/etc but
because 'no man is an island', rather than roll your own CityID,
instead look for a industry standard key e.g. here's one I did the
other day relating to the UK:

ISO 3166-2:GB administrative divisions of the UK (e.g. Staffordshire
=
'GB-STS'):
http://en.wikipedia.org/wiki/ISO_3166-2:GB

UK Internal [Country] Code (e.g. England = 1)
http://www.govtalk.gov.uk/gdsc/html/frames/UKinternalCode.htm

ISO 3166-1 alpha-3 country codes (e.g. United Kingdom of Great
Britain
and Northern Ireland = 'GBR'):
http://en.wikipedia.org/wiki/ISO_3166-1_alpha-3

Another thing to be aware of is attribute splitting (google it) where
you have to create joins between 100 tables just to be able to send
Auntie Mabel a birthday card, so don't let lofty ideals of data
modellers and relational theorist make you lose track of the business
problem in hand. And speaking of taking things to the nth degree, this
is an interesting article:

Design Challenge: Global Address
http://www.dmreview.com/portals/portalarticle.cfm?articleId=1062041&top

Are you getting the idea that the meaning of 'address' isn't clear
cut...? :)

Jamie.

--
 
R

Roger Carlson

Jamie makes a good point and I'd like to amend my answer to address that as
well.

I approached the answer from a pure academic normalization standpoint
because the point is to get a good grade. The Address-problem makes an
interesting and challenging assignment. As a college instructor myself, I
would be more interested in the logic behind your choices than the actual
answer.

However, in the real world, I have never designed a database with the
Address normalized to that extent. Generally, I keep all the fields in the
same table. I may have a look-up table for States and even Cites,
(especially if I'm going to search or sort on them) but that's all. There
comes a point when the cost of establishing and maintain the relationships
is more than the benefit derived.

For instance, the City -- Zip relationship would take a long time to fill an
intersection table with all the possible combinations. The only reason to
do so would be to make sure the user could never enter an impossible
City/Zip combination. But it's relatively easy for a person to view the
data and correct it manually. Furthermore, addresses are relatively static,
so it's not truly transactional data anyway. The same is true for City --
State.

On the other hand, I still think the Contacts table is a good idea because
it doesn't limit the number of alternate phone numbers or email addresses
you allow a vendor and makes your database much more robust to meet changing
business rules.

All this to say that sometimes fully normalized is not the optimum design,
but you should have good reasons to vary from it.

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
G

Guest

Thank you all for your well thought out advice. I will take these suggestions
back to the group and see what we can do with them. We are learning that
there are times you have to draw the line as to how far you want to go.

Have a good day
CoachBarkerOJPW

Jamie Collins said:
In our Database Management class, our next project deliverable is to
normalize our database to 3NF. The argument has come up that in our database,
any entity that has City, Zipcode and State has a transitive dependency
between the three.

Have you done the "Is an address an entity?" debate yet? It's a goodie
<g>.

Consider this approach: an address is an attribute of an entity not an
entity in itself; the meaning of 'address' in this context being, "The
text I have to put on the address label of an item of mail for the
post office to deliver it to the 'mailable' entity". Then you start to
think about what the post office expects e.g. USPS expects five lines
of CHAR(35).

If your business has a need to identify entities sharing an address it
may be better to model addresses as entities. If your business has a
need to identify entities with an address in the same administrative
area/city/county/electoral ward/etc it may be better to have an
attribute for administrative area/city/county/electoral ward/etc but
because 'no man is an island', rather than roll your own CityID,
instead look for a industry standard key e.g. here's one I did the
other day relating to the UK:

ISO 3166-2:GB administrative divisions of the UK (e.g. Staffordshire
=
'GB-STS'):
http://en.wikipedia.org/wiki/ISO_3166-2:GB

UK Internal [Country] Code (e.g. England = 1)
http://www.govtalk.gov.uk/gdsc/html/frames/UKinternalCode.htm

ISO 3166-1 alpha-3 country codes (e.g. United Kingdom of Great
Britain
and Northern Ireland = 'GBR'):
http://en.wikipedia.org/wiki/ISO_3166-1_alpha-3

Another thing to be aware of is attribute splitting (google it) where
you have to create joins between 100 tables just to be able to send
Auntie Mabel a birthday card, so don't let lofty ideals of data
modellers and relational theorist make you lose track of the business
problem in hand. And speaking of taking things to the nth degree, this
is an interesting article:

Design Challenge: Global Address
http://www.dmreview.com/portals/portalarticle.cfm?articleId=1062041&top

Are you getting the idea that the meaning of 'address' isn't clear
cut...? :)

Jamie.
 
J

Jamie Collins

In our Database Management class, our next project deliverable is to
normalize our database to 3NF.

We are learning that
there are times you have to draw the line as to how far you want to go.

FWIW there was a good discussion on comp.databases.theory recently,
entitled, "Who first (publicly) asserted 3NF is 'good enough'?"

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top