Data Normalization

R

R Tanner

I understand what the purpose and value of data normalization is, but
I just have one thing I am confused about. If you split up a table,
then it will create two tables. The child table will have a lookup
column to the parent table, which will have whatever value is
applicable that relates that given record to it's parent. What I
don't understand is how that is more efficient. In fact it seems to
duplicate the data if anything...
 
K

Klatuu

It duplicates only the value necessary to join the tables, not every field
in the parent table.

Although there is not universal consensus on this, my preference is to use
an Autonumber Primary key for the parent records, and a Long Integer field
for the foreign key fields in the child records.

This requires only 4 bytes per record.
There is also a performance gain because it is usually faster when
rebuilding the tables indexes.
If you have multiple field as a primary key, that process takes longer, and
if the fields are not all the same data type, it can get really slow.
 
J

John W. Vinson

I understand what the purpose and value of data normalization is, but
I just have one thing I am confused about. If you split up a table,
then it will create two tables. The child table will have a lookup
column to the parent table, which will have whatever value is
applicable that relates that given record to it's parent. What I
don't understand is how that is more efficient. In fact it seems to
duplicate the data if anything...

It duplicates *one field*. Your related table may have many fields!

Just for example, consider a People table with a PersonID, related one to many
to an Address table. Each person may have several addresses; each record of
the address table would have fields like AddressNo, Direction, Street, Suffix
(i.e. St., Ave., Blvd.), Postcode, City, State and Country. These fields don't
need to occur in the parent table and need not be duplicated.
 
R

R Tanner

It duplicates *one field*. Your related table may have many fields!

Just for example, consider a People table with a PersonID, related one tomany
to an Address table. Each person may have several addresses; each record of
the address table would have fields like AddressNo, Direction, Street, Suffix
(i.e. St., Ave., Blvd.), Postcode, City, State and Country. These fields don't
need to occur in the parent table and need not be duplicated.

Okay that example makes sense when I really think about it. I don't
know why I am having so much trouble grasping this concept. Let me
give you an example and tell me if I am right with this thought
process...

Say you have 5 different invoices. In an unnormalized table, on each
line, you would see the product, quantity, price, customer address,
customer name, and invoice number. Since there are 5 different
invoices, you would have 5 different records here in one table. 3
with the same customer, 2 with a different customer.

To normalize the table, you would want to split the table to create a
customer table with the customer name and address. The invoice number
would be the primary key in the invoices table and then a foreign key
in the customers table.

If I wanted to have a form update my database with a new invoice, I
would link the form to the invoices table and have the form add the
details, with a field on that form that allows for the entry of the
name of the customer. Is that correct?
 
R

R Tanner

It duplicates *one field*. Your related table may have many fields!

Just for example, consider a People table with a PersonID, related one tomany
to an Address table. Each person may have several addresses; each record of
the address table would have fields like AddressNo, Direction, Street, Suffix
(i.e. St., Ave., Blvd.), Postcode, City, State and Country. These fields don't
need to occur in the parent table and need not be duplicated.

Also in reply to your post about the primary and foreign keys - -
When I use the table analyzer and I split a table, the lookup column
to the parent table will have the actual values that are in the parent
table. Should this be the case, or would it not take up less memory
in the database to have the lookup column look up the primary key in
the parent table, which would be an autonumber...
 
J

John Spencer

In Response to your question on Invoices and customers.
You came pretty close, except the INVOICES table would contain a
CustomerID foreign key. The Customer table would not have an invoice id.

One Customer has (or could have) many Invoices. The many side gets the
Foreign key.


In response to your question on Primary keys
The Primary key should meet the following constraints:
A) It must be UNIQUE. No duplications
B) It should be stable (that is it should not change over time). There
are ways to handle changes in the value but if possible (in my opinion)
they should be avoided.
C) It must be available for every record.
Social Security Numbers (SSN) for people in the US are problematic. New
Borns don't have them immediately, visitors from overseas don't have
them, sometimes people won't share them, etc. It depends on the purpose
of the database whether an SSN would be a good candidate for a primary key.

Names of people are not unique and are not stable (marriage, divorce,
etc). Names of the States in the United States are stable and probably
will not change. Abbreviations for names of the States are stable and
do not change frequently (To Date: once in my lifetime).

So there is nothing wrong with using a Natural Key (some item or items
of data in each record) or with using an artificial key (autonumber or
other unique value that has no relation to the contents of the record).
This subject Natural vs Artificial can become a holy war with some
individuals. I use both depending on the table and situation.

'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================
 
R

R Tanner

In Response to your question on Invoices and customers.
You came pretty close, except the INVOICES table would contain a
CustomerID foreign key.  The Customer table would not have an invoice id.

One Customer has (or could have) many Invoices.  The many side gets the
Foreign key.

In response to your question on Primary keys
The Primary key should meet the following constraints:
A) It must be UNIQUE. No duplications
B) It should be stable (that is it should not change over time).  There
are ways to handle changes in the value but if possible (in my opinion)
they should be avoided.
C) It must be available for every record.
Social Security Numbers (SSN) for people in the US are problematic.  New
Borns don't have them immediately, visitors from overseas don't have
them, sometimes people won't share them, etc.  It depends on the purpose
of the database whether an SSN would be a good candidate for a primary key.

Names of people are not unique and are not stable (marriage, divorce,
etc).  Names of the States in the United States are stable and probably
will not change.  Abbreviations for names of the States are stable and
do not change frequently (To Date: once in my lifetime).

So there is nothing wrong with using a Natural Key (some item or items
of data in each record) or with using an artificial key (autonumber or
other unique value that has no relation to the contents of the record).
   This subject Natural vs Artificial can become a holy war with some
individuals. I use both depending on the table and situation.

'====================================================
  John Spencer
  Access MVP 2002-2005, 2007
  Center for Health Program Development and Management
  University of Maryland Baltimore County
'====================================================






- Show quoted text -

So let me pose this question to you John. I am trying to split one
table into about 4 other tables. These are the steps I am taking:

1. Run a Total's Query on the data I want to have in the parent
table.
2. Run a Make Table Query
3. Identify my primary key in the new Parent table. (the data will
not change, but more may be added, so I have a boolean helper column
called current I use in my SQL statement)
4. Create the relationship between the child table and my new parent
table. (I think maybe I am doing something wrong here. Two things I
understand about creating these relationships:

1. The datatype must be the same
2. The primary table can only maintain a
relationship with the child table through it's primary key (Because of
this, I identified my actual data as the primary key and created the
relationship between it and the actual data in the child table. What
I have been trying to do is figure out how to build the SQL statement
that will create the issue ID column I need in my child table that
contains the foreign key)

Does anything sound amiss in this scenario to you? Thank you for
emphasizing that the foreign key is only on the many side. I think I
vaguely recognized that, but you helped to clear it up.
 
R

R Tanner

So let me pose this question to you John.  I am trying to split one
table into about 4 other tables.  These are the steps I am taking:

1. Run a Total's Query on the data I want to have in the parent
table.
2. Run a Make Table Query
3. Identify my primary key in the new Parent table.  (the data will
not change, but more may be added, so I have a boolean helper column
called current I use in my SQL statement)
4. Create the relationship between the child table and my new parent
table.  (I think maybe I am doing something wrong here.  Two things I
understand about creating these relationships:

                  1. The datatype must be the same
                  2. The primary table can only maintain a
relationship with the child table through it's primary key (Because of
this, I identified my actual data as the primary key and created the
relationship between it and the actual data in the child table.  What
I have been trying to do is figure out how to build the SQL statement
that will create the issue ID column I need in my child table that
contains the foreign key)

Does anything sound amiss in this scenario to you?  Thank you for
emphasizing that the foreign key is only on the many side.  I think I
vaguely recognized that, but you helped to clear it up.- Hide quoted text-

- Show quoted text -

Also, in the following SQL statement, what is the purpose of the xyz?

SELECT [ID] AS xyz_ID_xyz, [ID] AS xyz_DispExpr_xyz, [Issue], [If
Other] FROM Issues ORDER BY [Issue], [If Other];
 
J

John W. Vinson

When I use the table analyzer and I split a table, the lookup column
to the parent table will have the actual values that are in the parent
table. Should this be the case, or would it not take up less memory
in the database to have the lookup column look up the primary key in
the parent table, which would be an autonumber...

Well... I'd suggest NOT using the table analyzer. It is no match for a human
intellect; and it uses the misleading, misdesigned, aggravating Lookup Field
type. Your lookup column in the parent table APPEARS to have the text values
BUT IT DOES NOT; it contains long integer foreign key values. The result is
correct (that's how I'd do it manually) but the method is suspect and can be
very confusing (it's succeeded in confusing you hasn't it!?)
 
R

R Tanner

Well... I'd suggest NOT using the table analyzer. It is no match for a human
intellect; and it uses the misleading, misdesigned, aggravating Lookup Field
type. Your lookup column in the parent table APPEARS to have the text values
BUT IT DOES NOT; it contains long integer foreign key values. The result is
correct (that's how I'd do it manually) but the method is suspect and canbe
very confusing (it's succeeded in confusing you hasn't it!?)

And yes, I did notice that the lookup column has the text, but the
datatype is a number. Very confusing and even with my limited
knowledge it seemed like it could cause problems.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top