Normalisation - any quick guides to this?

R

Robin

Hi

I thought I understood Access - I've built some fairly complex applications
in the past and have a fairly good understanding of VBA, but these have
generally been what I would call Reporting Tools. I've now been given what
seems to be a fairly simple job but I am struggling to know where to start,
because this really needs to be a proper relational database.

Could anyone point me to a good overview of the process of Normalization, or
some best practice steps for properly identifying relationships? (I know
how to create relationships - just can't work out the best options!)

Many thanks!

Robin
 
R

RuralGuy

Hi

I thought I understood Access - I've built some fairly complex applications
in the past and have a fairly good understanding of VBA, but these have
generally been what I would call Reporting Tools. I've now been given what
seems to be a fairly simple job but I am struggling to know where to start,
because this really needs to be a proper relational database.

Could anyone point me to a good overview of the process of Normalization, or
some best practice steps for properly identifying relationships? (I know
how to create relationships - just can't work out the best options!)

Many thanks!

Robin

Hi Robin,
You may find this link useful:
http://www.datamodel.org/NormalizationRules.html

_______________________________________________
hth - RuralGuy (RG for short)
Please post to the NewsGroup so all may benefit.
 
R

Robin

RuralGuy said:
Hi Robin,
You may find this link useful:
http://www.datamodel.org/NormalizationRules.html

_______________________________________________
hth - RuralGuy (RG for short)
Please post to the NewsGroup so all may benefit.

Thanks RG.

I'm not sure I will ever understand beyond 5NF but I guess that's good
enough. I'm going to read 1 - 5 again, several times.

My particular problem is something similar to a temp recruitment agency,
where there are people - placed with employers - in particular job roles,
but where employers might have multiple addresses, and people might possibly
do 2 or more different jobs at the same time, possibly with different
employers! There are also notes kept of each particular job placement which
need to stay with both employer and person when that person moves on.

Ultimately, I need a form bound to a table or query, showing each person,
where they are and what they are doing. I can't get my head round what that
record source should be - people, employers or jobs!

Maybe I'm thinking too much :)

Robin
 
T

tina

Ultimately, I need a form bound to a table or query, showing each person,
where they are and what they are doing. I can't get my head round what that
record source should be - people, employers or jobs!

forget about forms; you don't base relational design decisions on *display*
needs. structure your tables/relationships to correctly model the "real
world" relationships between the entities FIRST. then you can build forms
and reports to interact with and display the data any way you want. for more
reading on relational design, see
http://home.bendbroadband.com/conradsystems/accessjunkie/resources.html,
beginning with the Starting Out and Database Design 101 links. also, i often
recommend Database Design for Mere Mortals by Michael Hernandez. don't be
afraid to spend quality time working on the tables/relationships design - it
will repay you a thousand-fold in saved time later, as you build the rest of
the database.

hth
 
G

Guest

My first concept after reading your description is this:

An employer can have one or more employees.

An employee can have one or more employers.

This is a many to many relationship, resolved with a third table, perhaps
called "EmployeeAssignments", with foreign keys to the Employer table and to
the Employee table. A job "role", as you called it, could be captured by a
description field in the EmployeeAssignments table. A note field here,
associated with each employee assignment, would of course always be able to
tell you the employer and employee.

Finally, Employers can have one or more addresses.

That's four tables so far:
Employers
Employees
EmployeeAssignments
Addresses

Just a quick sketch...
 
R

Robin

Many thanks for the advice, and for the link - there's enough there to keep
me going for several weeks!

BTW, I wasn't really trying to do things backwards but I take the point :)

I've just realised the process of structuring tables/relationships is
entirely about relationships in the real world, and has nothing whatsoever
to do with what that data will be used for. Bit of a revelation!
 
G

Guest

Robin:

One thing you should try to gain some understanding of is the concept of
'functional dependency' as it is this which underlies the 'rules' of
normalization. A column in a table is functionally dependent on another
column if its value is determined by the value of the first column. If an
employee has an EmployeeID of 42 and a first name John and a last name Brown
the the FirstName and LastName columns are functionally dependent on the
EmployeeID column as every time EmployeeID 42 is encountered it refers to
John Smith.

Essentially, as far as the first 3 normal forms are concerned, non-key
columns must be dependent solely on the whole of a table's key. I recall
John Vinson summing this up one as :

"The key, the whole key and nothing but the key, so help me Codd"

Lets say John Smith lives in San Franciso, CA, so we add City and State
columns to the Employees table to record this. Now the City column (assuming
unrealistically for the moment that we don't record any more detailed
location for each employee) is functionally dependent solely on the
EmployeeID. You might think its also functionally dependent on the
FirstName/LastName column combination too, but that's not the case as there
could be another employee John Smith who's in New York. So City is
functionally dependent on the whole of the key, and so far the table is
correctly normalized. When we come to State, however, it is functionally
dependent on EmployeeID, BUT its also functionally dependent on City as San
Francisco determines that the State must be California (there may be another
San Francisco for all I know, but if so we'll ignore that for the sake of
this example and assume all city names are unique). So State is not
functionally dependent solely on the key (EmployeeID) but also on the non-key
column City. It is said to be transitively functionally dependent as
EmployeeID determines City which in turn determines State. So this breaks
the rule and the table is not correctly normalized to third normal form.

The solution to such 'redundancy' is to decompose the table. Remove the
State field from Employees and create a Cities table with columns City and
State. Create a States column with column State (possibly with two columns
if you want to store the full name as well as the abbreviation for each
State). Enforce referential integrity in the relationships between Employees
and Cities and between Cities and States. This ensures a non-existent city
can't be entered into Employees or a non-existent State into Cities. In
reality, as city names can be duplicated you'd have a numeric CityID in
Employees referencing a unique numeric CityID primary key of Cities.

I hope the above has given some idea of the basics of how functional
dependencies between columns in a table are used to determine whether a table
meets the normalization rules. When it comes to higher normal forms the
picture becomes a little more complex, particularly when it comes to fifth
normal form which involves Cross Join Functional Dependencies over more than
one table. If you concentrate on getting a good grasp of the principles of
the first 3 forms, however, and test your table definitions against these
principles you'll be on pretty safe ground for the vast majority of cases.

Finally let me give a strong endorsement to what Tina said about spending
time on the design of the tables and the relationships between them. Sit
down with a good supply of pencils and paper and draw the design out
diagrammatically, posing questions as to its being a valid model of that part
of the real world with which it is concerned. Only when you are happy with
the logical model should you start creating the tables themselves. If you get
the logical model right the interface will fall into place easily; get it
wrong and you'll end up jumping through hoops for ever more.

Ken Sheridan
Stafford, England
 
R

Robin

Thank you very much Ken for such a comprehensive answer. I think I do now
understand Functional Dependency and have managed to apply this to my table
structures.

I actually found rather than pencil and paper, that the Access Relationships
tool was very useful. I created initial tables very roughly as I thought
they might be, and then played around with relationships, modifying tables
as I went and adding new tables to join many-to-many relationships, until
the structure and relationships modelled the real world. I think I've also
managed to avoid redundant data and possible update/delete anomalies along
the way. I then fed in a few rows of dummy data to each table and played
around with the result - seems to work!

If it's laid out in the right way, the Relationships tool actually helps
reinforce the ideas of process and flow - there's a certain elegance in the
final result.

Now, I have a different problem altogether. Maybe I should post a new
question but just a quick overview...

I may have all the right relationships but I can't work out the best way to
display this. My main form is (I think correctly) bound to a table with
many foreign keys, but who wants to look at key values. I can bind the form
to a Query but this is obviously read only, I can create Combo Boxes to
display related tables as row source (but I don't actually need Combo
Boxes).

A flat structure is very simple - Text Boxes work just fine! (I did try
DLookup but the calculations are painful and all I'm doing is coding to
calculate relationships that already exist!)

I'm missing something. Should I use a Query and have a different interface
for edits?

Regards,

Robin
 
R

Robin

Thank you very much Ken for such a comprehensive answer. I think I do now
understand Functional Dependency and have managed to apply this to my table
structures.

I actually found rather than pencil and paper, that the Access Relationships
tool was very useful. I created initial tables very roughly as I thought
they might be, and then played around with relationships, modifying tables
as I went and adding new tables to join many-to-many relationships, until
the structure and relationships modelled the real world. I think I've also
managed to avoid redundant data and possible update/delete anomalies along
the way. I then fed in a few rows of dummy data to each table and played
around with the result - seems to work!

If it's laid out in the right way, the Relationships tool actually helps
reinforce the ideas of process and flow - there's a certain elegance in the
final result.

Now, I have a different problem altogether. Maybe I should post a new
question but just a quick overview...

I may have all the right relationships but I can't work out the best way to
display this. My main form is (I think correctly) bound to a table with
many foreign keys, but who wants to look at key values. I can bind the form
to a Query but this is obviously read only, I can create Combo Boxes to
display related tables as row source (but I don't actually need Combo
Boxes).

A flat structure is very simple - Text Boxes work just fine! (I did try
DLookup but the calculations are painful and all I'm doing is coding to
calculate relationships that already exist!)

I'm missing something. Should I use a Query and have a different interface
for edits?

Regards,

Robin
 
P

Pat Hartman\(MVP\)

"I can bind the form to a Query but this is obviously read only,"

Whoever told you that all queries were not updatable was incorrect.

All your forms and reports should be bound to queries. They allow you to
specify selection criteria and join to lookup tables to select additional
data fields.

Normal procedure is to use comboboxes on forms to select the values for
foreign key fields. The combos allow you to display and select the text
value but store the numeric ID value.
 
P

Pat Hartman\(MVP\)

"I can bind the form to a Query but this is obviously read only,"

Whoever told you that all queries were not updatable was incorrect.

All your forms and reports should be bound to queries. They allow you to
specify selection criteria and join to lookup tables to select additional
data fields.

Normal procedure is to use comboboxes on forms to select the values for
foreign key fields. The combos allow you to display and select the text
value but store the numeric ID value.
 
G

Guest

Robin:

Queries can be updatable or non-updatable depending on the circumstances.
As Pat has said foreign key columns generally use a combo box on a form.
Lets say you have three tables Locations, Cities and States. Locations would
have a CityID long integer number foreign key column, and Cities could
similarly have a StateID foreign key column (though as US states all have
unique names you could in that case use a 'natural' foreign key State of text
data type).

A form for entering data would be based on a query such as:

SELECT *
FROM Locations
ORDER BY Location;

The CityID column's control on the form would be a combo box with a
RowSource, assuming the use of a numeric StateID foreign key column in Cities:

SELECT CityID, City, State
FROM Cities INNER JOIN States
ON Cities.StateID = States.StateID
ORDER BY City;

Its ColumnCount property would be 3, its BoundColumn 1, its ColumnWidths
something like 0cm;3cm;3cm (or rough equivalent in inches) and its ListWidth
6 cm (the sum of the columnwidths) Experiment with the dimensions to get the
best fit, but the first dimension of the ColumnWidths property must be zero.
To display the State when a city is selected add an unbound text box to the
form with a ControlSource of:

=cboCities.Column(2)

where cboCities is the name of the combo box. The Column property is
zero-based, so Column(2) refers to the third column (State). When a city is
selected from the combo box the name of the city will show in the control,
but its value will be that of the hidden CityID column. The state for the
selected city will show in the unbound text box.

Ina form such as the above the tables are in a many-to-one relationship,
Locations to Cities. Where the form uses the reverse relationship, i.e. the
form's underlying table relates one-to-many to another table then the usual
way to enter data into the referencing (Many-side) table is via a subform
embedded within the parent form and linked to it on the key columns. A form
of Customers with a subform of Orders would be an example, the orders subform
being linked to the parent customers form on the CustomerID columns in each
table. The linking is done via the LinkMasterFields and LinkChildFields
properties of the subform control, but if you create the subform and main
form and then drag and drop the subform from the database window onto the
main form in design view the values for these properties should be entered
automatically. I would normally use a continuous form as the subform, in
such circumstances.

Ken Sheridan
Stafford, England
 
G

Guest

Robin:

Queries can be updatable or non-updatable depending on the circumstances.
As Pat has said foreign key columns generally use a combo box on a form.
Lets say you have three tables Locations, Cities and States. Locations would
have a CityID long integer number foreign key column, and Cities could
similarly have a StateID foreign key column (though as US states all have
unique names you could in that case use a 'natural' foreign key State of text
data type).

A form for entering data would be based on a query such as:

SELECT *
FROM Locations
ORDER BY Location;

The CityID column's control on the form would be a combo box with a
RowSource, assuming the use of a numeric StateID foreign key column in Cities:

SELECT CityID, City, State
FROM Cities INNER JOIN States
ON Cities.StateID = States.StateID
ORDER BY City;

Its ColumnCount property would be 3, its BoundColumn 1, its ColumnWidths
something like 0cm;3cm;3cm (or rough equivalent in inches) and its ListWidth
6 cm (the sum of the columnwidths) Experiment with the dimensions to get the
best fit, but the first dimension of the ColumnWidths property must be zero.
To display the State when a city is selected add an unbound text box to the
form with a ControlSource of:

=cboCities.Column(2)

where cboCities is the name of the combo box. The Column property is
zero-based, so Column(2) refers to the third column (State). When a city is
selected from the combo box the name of the city will show in the control,
but its value will be that of the hidden CityID column. The state for the
selected city will show in the unbound text box.

Ina form such as the above the tables are in a many-to-one relationship,
Locations to Cities. Where the form uses the reverse relationship, i.e. the
form's underlying table relates one-to-many to another table then the usual
way to enter data into the referencing (Many-side) table is via a subform
embedded within the parent form and linked to it on the key columns. A form
of Customers with a subform of Orders would be an example, the orders subform
being linked to the parent customers form on the CustomerID columns in each
table. The linking is done via the LinkMasterFields and LinkChildFields
properties of the subform control, but if you create the subform and main
form and then drag and drop the subform from the database window onto the
main form in design view the values for these properties should be entered
automatically. I would normally use a continuous form as the subform, in
such circumstances.

Ken Sheridan
Stafford, England
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top