Attention Ken Sheridan

  • Thread starter dave@homedeliverygroup
  • Start date
D

dave@homedeliverygroup

Dave:

It seems to me that you have two entity types here, one being the claim,
i.e. the 'case' as a whole, the other being the works associated with a
claim. The claim will have attributes such as the policyholder, policy
number etc, i.e. those attributes which are specific to the claim as a whole.
The works will have attributes such as the type of work, its costs etc and
these will be specific to each set of works rather than the claim a whole.

In a relational database entity types are modelled by tables and their
attributes by the columns (fields) of the tables. So you would have a claims
table with columns for its attributes, and a works table with columns for its
attributes, the two being related in a one-to-many relationship by means of a
foreign key ClaimNumber column in the Works table referencing the primary key
ClaimNumber of the Claims table.

You are right to think in terms of a subform, but you would have only one, a
works subform within a claims form, the two linked on ClaimNumber. All
works, including those associated with the initial claim, would be entered as
records in the subform. Consequently when claim 6582188 is retrieved all the
works associated with it would be shown in the subform and as many additional
works as necessary can be added at any time simply by inserting another
record in the subform.

As it sounds like you have everything in the one table at present you will
need to 'decompose' it into the two tables. This is very simple to do.
having created the Works table you then create an 'append' query which
inserts rows into the works table from your current table. You'd append the
values of the CaseNumber column along with the values from the other columns
which are attributes of the works, and which are now represented by columns
in the works table. You can then create a relationship between the tables,
enforcing referential integrity, cascade updates (the latter to cater for a
claim number being changed after its associated works record(s) had been
entered – this might merely be the correction of a simple error by the user),
and, if appropriate, cascade deletes (this ensures that if a claim record is
deleted the works records associated with it are also deleted).

Once you are satisfied that the works table is correctly populated you can
delete the now redundant columns from the original table.

Finally you'd redesign your form to reflect the tables, deleting the
redundant bound controls from the form and adding the works subform.

You mention that the ability to add further works is only one of the
enhancements proposed. Be sure to look at the totality of these when
considering the modifications to the database which are needed as there may
well be further entity types involved and further relationships with existing
or yet to be created tables. The basis of a solid relational database is
that it is a 'logical model' of the real world entities and the relationships
between them. Getting the logical model right is the key to success. Do so
and the interface will fall into place naturally, but get it wrong and you'll
end up constantly jumping through hoops to get round the defects in the
model. I'd strongly endorse Jeff's advice that you take time to become
familiar with the principles of the database relational model before getting
too deep into the application design.

Ken Sheridan
Stafford, England

Ken thanks for you above reply you are correct i have all the info in one
table as follows...

one table name is hdg_claimTBL

1. today_date
2. hdg_ticket_number (primary key)
3. claim_status (from drop down list)
4. date_of_occurence
5. market_hub (from drop down list)
6. store_and_location (from drop down list)
7. checkpayableto (from drop down list)
8. claimType(from drop down list)
9. original_invoice
----------------------------------------------------------------
10. first_name
11. last_name
12. address
13. second_address
14. apartment_number
15. city
16. state
17. zip_code
18. telephone
19. alt_telephone
------------------------------------------------------------------------------
20. contractor_name
21. driver_name
22. deduction_amount
23. repayment_amount
24. number_of_weeks
25. description_of_claim
------------------------------------------------------------------------------
26. sku_number
27. model_number
28. merch_description
29. item_retail_cost
30. item_cost
-----------------------------------------------------------------------------
31. year
32. make
33. model
34. vin_number

as you can see it is a very larger tabel(34 items)... as you stated... now
according to "normalization" there are no duplicative columns unrelated to
the ticket number, however i do see where i can separate the table down into
five different tables of related data as i serperated them with the dashed
lines... does that make more sense? I also realize now that i can't make the
ticket number the primary key...
 
K

Ken Sheridan

Dave:

Simply breaking out subsets of columns (fields) into separate tables is not
usually enough. You have to identify the entity types involved in that part
of real world which the database is modelling, their attributes (which must
be specific to the entity type, what's known in the jargon as 'functionally
dependent'), and the relationships between them. Each entity type will be
represented by a table and its attributes by columns in the table. The
relationships will be represented either by a foreign key column in one table
referencing the primary key of another table ( a many-to-one relationship
type), or by another table with foreign keys which reference the primary keys
of two or more other tables (a many-to-many relationship type).

Looking at your partitioning of the columns it looks to me like what you
have in terms of entity types here along these lines:

Claims (1 -9)
Customers (10-19)
WorkProgramme (20 – 25)
Materials (26 -30)
Vehicles (31 – 34)

There are a few details of these which I can see at first sight need
attention. For instance you have both a city and state column for customers.
You only need the city (in fact a numeric CityID as city names can be
duplicated). The Cities should be in a separate referenced Cities table, and
this in turn should reference a States table. Otherwise you are introducing
redundancy as it would be possible for San Francisco to be in California in
one customer's address and erroneously in New Hampshire or any other state in
another. You might think that's unlikely to happen, but Murphy's Law states
otherwise!

You also have two separate address columns. This is bad design; you should
have a separate Addresses table which is references the Customers table, so
each address would be a separate row in the addresses table. The same
applies to telephones.

These however are just generic design faults. The important process is to
analyze the model so that you are confident each table does represent a real
world entity type, its columns represent the attributes specific to that
entity type, and the relationships represent the real world relationships
between them. Use your current table as a guide to this process , but don't
be constrained by it. Its important that the model fits the real world, and
you should not try and squeeze the real world into a preconceived model.

I'd need to study your business model in a lot more detail than space or
time here allows to come up with a detailed set of tables and relationships,
but on the whole you seem to have the basis for decomposing the table without
too many problems. The key is to first understand how the database
relational model works in principle. You'll find plenty online if you Google
around a bit, and there are many books on the subject, but to give you a
flavour of things here are two of my 'stock' summaries, one on the relational
model, the other on normalization. They are a simplified statement of these
topics of course, and there is duplication between the two, but hopefully
they'll give you an inkling:

1. The database relational model was originally proposed by E F Codd in a
paper in 1970 in the journal 'Communications of the Association for Computing
Machinery'. Since then there has been a vast amount of further theoretical
work, and the relational model has shown itself to be a robust one. Without
going too deeply into the theoretical basis, which can be quite abstract, a
relational database in essence models a part of the real world in terms of
its entity types and the relationship types between them. Note the inclusion
of the word 'type' in both cases here. While its almost always used in the
former case, its often omitted in the latter case. This is a little bit
sloppy but not too important. When one talks about a 'relationship' it
really refers to a relationship value. As an example 'marriage' is a
relationship type, but my being married to my wife Fiona is a relationship
value, represented by our names on the marriage certificate, which is the
physical equivalent of a row in a Marriages table with columns Husband and
Wife, each referencing the primary key of a table People. This is a
many-to-many relationship type (I've been married twice so would be in two
rows, my first wife would also be in two rows as she remarried too). It is
resolved into two one-to-many relationship types, People to Marriages in each
case, in one case via the Husband column in the other via the Wife column.

In a relational database tables model Entity Types. In the above example
People is an entity type, modelled by the People table. Marriage is also an
entity type, modelled by the Marriages table. As we've seen its also a
relationship type. In fact a relationship type is just a special kind of
entity type.

Each column in a table represents an attribute type of each entity type, so
attribute types of People might be FirstName, LastName, DateOfBirth etc.
This table would also have a PersonID numeric column (usually an autonumber)
as its primary key as names are not unique. Each row in a table represents
one instance of the entity type, and the attributes of each instance are
represented by values at column positions in the row. This is the only way
that data can be legitimately stored in a relational database.

Its important that there is no redundancy in the information content of the
database. This is achieved by the process of 'normalization'. Normalization
is based on a set of 'normal form's ranging from First Normal Form (1NF) to
Fifth Normal Form (5NF). There is also one called Boyce/Codd Normal Form
(BCNF) which was inserted when it was found that the original Third Normal
Form was deficient; it didn't cater satisfactorily for tables with two or
more candidate keys where the keys were composite and overlapped, i.e. had a
column in common. I won't go into the details of normalization here; you'll
find it written up in plenty of places.

To see an example of redundancy and therefore a table which is not properly
normalized take a look at the Customers table in the sample Northwind
database which comes with Access. You'll see that it includes City, Region
and Country columns. If you look at its data you'll se that we are
redundantly told that São Paulo is in province SP which is in country Brazil
4 times. This is not just inefficient, it is dangerous as it leaves the
table open to inconsistent data being entered. There is nothing to stop
somebody putting São Paulo in the UK, USA or in each in separate rows in the
table for instance. To normalize the table it should be decomposed into
Customers, Cities, Regions and Countries tables, each of the first three with
a foreign key referencing the primary key of the next table up in the
hierarchy.

2. Normalization is the process of eliminating redundancy from a database,
and involves decomposing a table into several related tables. In a
relational database each table represents an entity type, e.g. Contacts,
Companies, Cities, States etc. and each column in a table represents an
attribute type of the entity type, e.g. ContactID, FirstName and LastName
might be attribute types of Contacts and hence columns of a Contacts table.
Its important that each attribute type must be specific to the entity type,
so that each 'fact' is stored once only. In the jargon its said that the
attribute type is 'functionally dependent' solely on the whole of the primary
key of a table.

To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.

Similarly the Companies table might have a CityID column and the Cities
table a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in) a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.

To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of inconsistent data, e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA.

An example of what at first sight might seem to be redundancy, but in fact
is not, can also be found in Northwind. The Products table and the
OrderDetails table both have UnitPrice columns. It might be thought that the
unit price of a product could always be looked up from the Products table, so
its unnecessary in Order Details. However, the unit price of a product will
change over time, but each order needs to retain the price in force at the
time the order was created. Consequently a UnitPrice column is needed in
both tables; that in products holds the current price and is used to get the
value for that in Order Details (code in the ProductID control's AfterUpdate
event procedure in the Order Details Subform does this), which then remains
static when the current price (in products) changes. In each case UnitPrice
is functionally dependent on the key of the table, so there is no redundancy.

Ken Sheridan
Stafford, England
 
J

Jeff Boyce

Dave

In addition to Ken's suggestions, think twice about how you handle City.
The same city name (e.g., "Paris") can happen in more than one
state/province/country.

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
D

dave@homedeliverygroup

Ken,
First off thanks for the time that you have put into your replies...


the one reason why i had put it into one table is that even though some of
the fields are not used they are all printed out on a report...

(BTW we are a delivery company for Lowes)

You have broke down my table this way
Claims (1 -9)
Customers (10-19)
WorkProgramme (20 – 25)
Materials (26 -30)
Vehicles (31 – 34)

you were almost correct in your description 1-9 is the claim info and what
hub it was out of(our hubs are mainly on the east coast) as well as what
store and the loacation of that store, which is a drop down(storesTBL) list
containg all our store information, 10-19 is the customers info where the
damage happened. 20-25 is actually the delivery contractor that caused the
damage, how much he needs to repay(as we pay off all the claims then deduct
from our contractors), the number of weeks the payoff will be deducted till
paid back in full and a brief description of the damage... 26-30 is actually
info for a merchadise damage claim if our contractor damages the merchandise
then the cost to replace the damaged merchandise would also go into the 20-25
section... 31-34 is if the contractor has an accident involving another auto,
or hits something with the delviery truck again the cost to pay back would go
into the 20-25 section. So section 1-9 is always entered as is 10-19, and
20-25... 26-30 only if merchandise damage, and 31-34 only if an accident has
occured.

I am not sure what you mean by the city and state and having a seperate city
and state table as we deliver to hundreds of citys across several states i do
see what you mean as the same city could be entered in New York as well as
Pennsylvania. Would that mean i would have to create a city/state table and
enter the info or create them and let the data entry build the tables????

Also the addresses and telphone numbers you mentioned to have a separate
addresses table that references the customers table is way beyond me...more
reading on that for me.

Currenlty there is not another person in our company that has any Access
experience other than myself and i only had it when i was in college which
was a bit of time ago... so i am moving forward myself with plenty of
reading... the DB that i have created will work for the time being, however i
want to be able to enhance, make changes and make it better functional wise i
see the flaws in it now. for instance i have all 200 of our stores in one
table... LOL one table again... must be my way of thinking... i tried to get
it to work where when a person entered the market_hub the store_and_location
only listed the stores for that hub... sounded simple, but did not work... we
have currently 10 hubs so i had 10 stores tables which made sense but then i
just entered all the info into one table... but as i read and re-read your
"stock" summaries somethings are coming to light...

dave
 
K

Ken Sheridan

Dave:

Lets first deal with the Cities issue as that's just a question of basic
normalization. I can illustrate it best with an example. Firstly with a
table Contacts which has columns for City and State as well as ContactID
FirstName, LastName, AdddreesLine1, AddressLine2 etc, we might have rows with
data like this (excluding some columns for simplicity)

1 John Brown San Francisco California
2 Jane Green San Francisco New Hampshire

Jane Green's row is obviously incorrect because we all know which state San
Francisco is in, but there is nothing in the design of the table to prevent
this incorrect data being entered, nor would a visitor from Mars know which
is the correct row or would assume that there are cities called San Francisco
in both states. This is because the table is not properly 'normalized'. Its
actually not normalized to Third Normal Form (3NF), which requires that all
non-key columns in the table must be functionally dependent solely on the
whole of the table's primary key.

In the table The FirstName, LastName and City fields do meet this criterion
because the sole determinant for them is Contact ID. The State column
however is not determined solely by ContactID, but also by City (assuming,
albeit wrongly, for the moment that all city names are unique) as once we
know that the city is San Francisco we know that the contact is in
California, or we would in a correctly designed database.

The solution is to 'decompose' (yes, that is the term used) the table into
therr normalized tables, Contacts, Cities and States. Cities will have
columns CityID (because city names are duplicated we can't the name as the
key), City and State. The States column needs just the one column State (or
two if you want one for the abbreviation and one for the full name of the
state). The State column would be the key as state names are not duplicated,
so we don't need a 'surrogate' numeric key (though some people prefer to use
one anyway). So the tables would now look like this:

Contacts:
1 John Brown 42
2 Jane Green 42

Cities:
42 SanFrancisco CA

States:
CA California

In the relationships between States and Cities referential integrity would
be enforced, which means only a valid state can be entered in Cities, and
cascade updates would also be informed as theoretically a state's abbreviated
name could be changed, so changing it once in States would automatically
change it in the matching rows in Cities. In the relationship between Cites
and Contacts referential integrity would be enforced, but as CityID is
probably an autonumber whose value can't be changed then its not necessary to
enforce cascade updates.

You can enter all the state in the states table in one go of course, and
probably many of the cities in the city table, but when you need to enter a
city in the contacts table which doesn't yet exist in the Cities table you
need to enter the latter first. This is something you'd usually build in via
the interface so that when entering a contact in a contacts form you can pop
up a form to enter a new city before completing the entry in the contacts
form.

With this set of normalized tables you can see that the same city can't be
mistakenly put in two different states as the only place where we are told
which state its in is in one row in the Cities table. If by any chance the
visiting Martian was right in thinking that there are San Francisco's in both
states (probably not, but I know there are 4 Staffords in the USA because we
are twinned with them, so it’s a real possibility) then ther would be two
rows in Cities , both with San Francisco in the City column, but with
different CityID and State values, and the value in CityID in Contacts would
be that for whichever of the two is the correct one.

BTW you'll find a demo of how to handle this sort of data via correlated
combo boxes on forms at:


http://community.netscape.com/n/pfx...yMessages&tsn=1&tid=23626&webtag=ws-msdevapps


It uses the local administrative areas of Parish, District and County in my
area, but the principle is exactly the same.

Your statement "the one reason why I had put it into one table is that even
though some of the fields are not used they are all printed out on a report"
illustrates another generic point. One thing to remember is that by 'table'
we don't just mean a 'base table' i.e. a real stored table; it also means
the 'result table' of a query, so for a report you'd join the necessary
tables in a query and use that as the RecordSource of the report, which would
be just like using a single base table. If we joined the Contacts, Cities
and States tables in the above example for instance we'd end up with a Result
table which is exactly the same as the original single table before we
decomposed it, but without the scope for inconsistent data which the single
table allowed. BTW as an example of these sort of errors in a real life
scenario I once found three versions of my name as author of technical papers
in my own field (environmental planning) in one database. As far as the
database is concerned I'm three separate people!

Applying these principles to your own database is going to be down to you;
we can point you in the right direction but we don't have the insider
knowledge of you business model to be able to tell you exactly what you need
in the way of tables and relationships. Lets look at one or two points of
detail, though, which might help illustrate how the principles will be
applied in your case:

"I have all 200 of our stores in one table"

That sounds like how it should be. 'Stores' is an entity type and will have
attributes such as its address data, so each row in the table will represent
one store and each column position in a row will represent the attribute
value, e.g. a sore in Boston Mass. (where my grandmother was brought up
incidentally) might have a CityID value of 21 where 21 is the value in the
CityID column of the row in the Cities table for Boston.

"I tried to get it to work where when a person entered the market_hub the
store_and_location only listed the stores for that hub"

That's what the demo I gave the link for above does in fact. Firstly,
though, you need to identify the relationships. There are several
possibilities:

1. Each store relates to one hub only, but two stores in one City say,
could relate to different hubs. In this case the relationship is a simple
many-to-one relationship from Stores to Hubs, so you'd have a Hub (or HubID)
foreign key column in the Stores table referencing the primary key of the
Hubs table.

2. Alternatively all stores in one City could relate to one Hub, and each
Hub could cover more than one City. In this case the relationship is bwteen
Cities and Hubs so you'd have a Hub (or HubID) foreign key column in the
Cities table referencing the primary key of the Hubs table. There'd be no
Hub or HubID column in the Stores table as once you know which city its in
you know which hub it relates to.

3. Another possibility could be the same as 2 but all stores in one State
could relate to one Hub, in which case you'd have a Hub (or HubID) foreign
key column in the States table referencing the primary key of the Hubs table.
Again there'd be no Hub or HubID column in the Stores table as once you know
which state its in you know which hub it relates to.

4. Over here we tend to use Post Codes (equivalent of Zip codes) for this
sort of thing, and once a post code is entered you know not only which city
or town and county (we don't have states of course) but which street and
which part of the street an address is in. My post code for example shows
that I am on the east side of my street. You could have something similar as
regards hubs in your case you'd have a Hub (or HubID) foreign key column in
a ZipCodes table referencing the primary key of the Hubs table. Again
there'd be no Hub or HubID column in the Stores table as once you know a
store's Zip code you know which hub it relates to.

I hope that gives you some idea of the sort of questions toy need to ask
yourself when setting up your tables. I think the ball is now in you court
to come up with a model, but I'd strongly recommend that you draw it out on
paper forts before creating the tables and relationships, using boxes for
each table and arrowed lines between then for the relationships (rather like
you see in the relationships window in Access). This is called, not
surprisingly, an entity relationships diagram. Mentally test the paper model
as you go along by rigorously asking yourself does it accurately represent
your real world business model. You'll find this, like the prospect of
execution, not only concentrates the mind wonderfully, but will save you a
lot of headaches later when you come to build the database itself. Getting
the 'logical model' right at the outset is the real key to a robust and
efficient database; the interface will follow naturally from the model if its
right, but will be a PITA to design if its wrong.

Ken Sheridan
Stafford, England
 
D

dave@homedeliverygroup

Thanks again Ken... you are very helpful... this is a start... i also checked
out the Northwind DB some nice things in there as well...

Started to setup my tables on paper as suggested...

PS are there any good "novice" type books you may suggest that are good?
 
J

Jeff Boyce

Ken

I can only locate one city named "San Francisco", and it is in California.

But if you look up the city named "Salem", there's one in Oregon, one in
Illinois, one in Massachusetts, one in ...

Going only by city name is NOT guaranteed to provide unique cities.

(or maybe I'm mis-interpreting your description...)

Regards

Jeff Boyce
 
J

John W. Vinson

Ken

I can only locate one city named "San Francisco", and it is in California.

But if you look up the city named "Salem", there's one in Oregon, one in
Illinois, one in Massachusetts, one in ...

Going only by city name is NOT guaranteed to provide unique cities.

(or maybe I'm mis-interpreting your description...)

Fully agreed. I've read that there is (or at least was at one time) a
Springfield in each of 45 states. And as for San Francisco...
http://www.mapquest.com/maps?city=San+Francisco&state=NM

Not quite as big or rich a place, but it's a place!
 
K

Ken Sheridan

Jeff:

I obvious wasn't clear enough. That's exactly what I was saying. I
invented the hypothetical San Francisco in New Hampshire to illustrate that
my first non-3NF table might not necessarily have been incorrect in terms of
its content. The point I was making was that our Martian visitor would not
know for certain whether there are two San Franciscos or not from the
original table, but if its decomposed and there were two San Francisco rows
in Cities with different CityID and State values then it would be clear that
there are two, hypothetically speaking.

Somewhere along the line I pointed out that there are four Stafford's in the
USA, all presumably originally named after this little one horse town where I
am.

Ken Sheridan
Stafford, England
 
D

dave@homedeliverygroup

Hey Ken i picked up MS Office Access 2003... since that is the version my
company is using ... thanks for the head start...
 
D

dave@homedeliverygroup

Ken if you read this again... i can send you a copy of the DB you could then
see the business aspect of it... it is a simple DB...

(e-mail address removed) is my email... address...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top