DB Normalization

J

John Vinson

Well, I am certainly relieved to hear that it may not be necessary to change
the PK. The DHSNo is a unique field and there can be no duplicates (what a
nightmare that would be for the court clerk). The reason I went down this
bunny trail in the first place is that I guess I misunderstood some of the
posts below in response to my earlier post below entitled: Data
Normalization Dispute. Specifically, poster "Rolls" who says: " The
reason for not making the case number the PK is that this field HAS MEANING!
The instant that a key field has meaning becomes the instant that it should
NOT be a primary key." So do I assume that this is some sort of ongoing
debate between professionals?

Well, I certainly would disagree with Rolls on this point. If there is
a good natural key (such as a Case Number, or the two-letter
abbreviation of a state or province in a States lookup table, or the
like) - I'll use it, even though "ID" and "AK" in fact have meaning.
I haven't made any changes to my original db at this point. I was only
fooling with copies. From the little bit that I did, I quickly realized
that changing the PK at this point would be a nightmare and I do NOT want to
do it if it is not necessary.
Agreed.

Also, PC kept asking me if I analyzed the tables. What specifically am I
looking for? PC stated that I should do an analysis because of the
one-to-one relationships between a lot of the tables. But if I lump all the
tables together, I get a gigantic table with many, many fields. This was
the reason I split the data up into logical groups -Record requests,
Discovery Sent, Disovery Received, Hearing, Status, Tasks.

I'll browse through Cheryl's comments downthread (always eager to
learn something new!) but at first glance these are probably NOT
one-to-one relationships. Surely one case will have *many* Requests,
Discoveries, Hearings? How are these one-to-one child tables
structured? Do you still have repeating fields for Hearing1, Hearing2,
Hearing3 and the like? If so - you're storing a one to many
relationship in each record.

But I'll quit here and go read the rest of the thread...
 
R

Rolls

Sure, if you have a table with few records like a two character field,
Access will let you use "AK" (a text field) as a PK and it'll (sort of)
work.

B-b-b-BUT! for a large number of records there will be a size and
performance penalty, and code is less reusable, than if you follow one
pattern for all tables instead of using a haphazard table design.

An "Intelligent key" is an oxymoron. Sorry.
 
S

S. Jackson

Rolls said:
There appears to be a conceptual problem (or two) here.

Your "entity" tables must contain PK fields. Relationship tables refer to
those keys as FK fields. tblPeople contains records, each with a PK to
denote one unique person, each. As long as that person exists in the
database, that PK field should never change. The PK field typically is an
Autonumber field, is automatically assigned, and is not seen by the user.
It's used to link tables PK <-> FK. The fact that you have multiple joined
tables is due to the fact that you have multiple entities and relationships
in your model. There is not a 1 : 1 relationship between fields that are
not within the same table.

Yes, I agree -a conceptual problem here. I think perhaps I haven't been
able to properly articulate how I have designed the db and that perhaps some
are not familiar with legal practice.
If you're thinking in terms of changing key values, this probably indicates
that your model isn't correct; you haven't normalized your tables.

The only reason I am thinking about changing key values is because of your
statement in the discussion below:

After reading others post, it appears that it is not necessary.

In terms of using an Autonumber PK instead of a "case number" (which may be
unique, too) the autonumber is an integer which takes less space and runs
faster than a text field used as a key. If the text field has meaning, and
the meaning changes, you've just destroyed referential integrity and have to
rebuild keys and relink fields, which is unnecessary with good design. In
your DB the "case number" (field) is as attribute of the "case" (record)
which the Autonumber key field denotes as "unique".
How can the meaning change? I don't understand what that means. A case
number is a case number. It is unique. Once assigned by the court, it
stays that way until the case is closed or consolidated with another. If
the case is consolidated, a new record is created and the old record is
"closed."
Stop trying to complicate something that is simple.

I am sorry if you seem to think that I am complicating the matter. However,
I respectfully disagree with the fact that it is ME that is complicating the
matter. I think what has happened here is that there are too many "cooks in
the kitchen" and I am becoming more and more confused by each response. BUT,
please everyone understand, I very much appreciate ALL of your help and I
have been very careful to state that in each and every post I have made.

I think enough has been said with regard to this string and we should all
move along. :)
 
S

S. Jackson

Sorry PC - too many posts in this thread. I did go looking for that post
before I posted the question, but didn't find it in the mix.

Remember though, I have said "please" and "thank-you" and "excuse my
ignorance."

Let's all move on, shall we? :)

S. Jackson
 
J

John Vinson

B-b-b-BUT! for a large number of records there will be a size and
performance penalty, and code is less reusable, than if you follow one
pattern for all tables instead of using a haphazard table design.

Well, I'll still choose to disagree. A seven-byte Text field properly
indexed will be just as fast as a four-byte Long, with only trivial
overhead; the code should not depend on the value of the PK in any
case; and I don't see having two functional "primary keys" as being
any less haphazard than using a natural key would be.

But I do agree that this is a contentious point about which reasonable
people can disagree! The database can be made to work correctly either
way (natural key or artificial key), and a lot depends on your
approach to how to think databases through. "There are nine and ninety
ways/of constructing tribal lays/and EVERY SINGLE ONE OF THEM IS
RIGHT!" (Kipling)
 
C

Cheryl Fischer

I'll browse through Cheryl's comments downthread (always eager to
learn something new!)

Oh, pish! <g>

--

Cheryl Fischer, MVP Microsoft Access
Law/Sys Associates, Houston, TX
 
M

Maya

I am with Rolls 125%!!

The following statement you made supports that he is correct --
<<Once assigned by the court, it stays that way until the case is closed or
consolidated with another. If
the case is consolidated, a new record is created and the old record is
"closed.">>

If a case starts out as 02-3456-K and is consolidated with another case and now
becomes 04-7755-K, the reference to all previous foreign keys is lost and all
the related records in the other tables become orphaned. If an autonumber is
used for the primary key in TblCaseInfo, consolidating the case with another
case simply means you go into TblCaseInfo and change the DHSNo to the new
number, nothing else needs changed!

For those proponents of using the DHSNo for a primary key ---
The prime directive of database design is that any data item should appear only
once in the tables of a database. DHSNo is a data item and as such should appear
only once in the tables. Using the DHSNo as a primary key causes this data item
to appear multiple times throughout the tables and consequently violates the
cardinal rule of good database design. The issue of the possibility of
consolidating a case clearly illustrates why it is imperative to follow this
prime directive. Yes, you may be able to put bandages on the database when this
happens, but had you followed the prime directive inthe first place you would
have no scraped knees.

Rolls - well done!

Shelly, you would be well advised to follow Rolls recommendation!

Steve
PC Datasheet
 
R

rkc

In terms of using an Autonumber PK instead of a "case number" (which may be
unique, too) the autonumber is an integer which takes less space and runs
faster than a text field used as a key. If the text field has meaning, and
the meaning changes, you've just destroyed referential integrity and have to
rebuild keys and relink fields, which is unnecessary with good design. In
your DB the "case number" (field) is as attribute of the "case" (record)
which the Autonumber key field denotes as "unique".

An autonumber is a long.

Do you have a reference for the "fact" that a numerical key "runs faster"
than a text key? Are they indexed differently or something?
 
R

rkc

Maya said:
I am with Rolls 125%!!

The following statement you made supports that he is correct --
<<Once assigned by the court, it stays that way until the case is closed or
consolidated with another. If
the case is consolidated, a new record is created and the old record is
"closed.">>

If a case starts out as 02-3456-K and is consolidated with another case and now
becomes 04-7755-K, the reference to all previous foreign keys is lost and all
the related records in the other tables become orphaned. If an autonumber is
used for the primary key in TblCaseInfo, consolidating the case with another
case simply means you go into TblCaseInfo and change the DHSNo to the new
number, nothing else needs changed!

The old record is closed, not updated. If you update the DHSNo then
the knowledge that the original case number exists disappears. That doesn't
sound desirable to me.

Seems to me that what is missing is a way to relate the old case number to
the
new case number. But I have no more idea of the real requirements of the
database then you do.
 
P

PC Datasheet

<<So far, I have been arguing against making the PK change because of the
nightmare it is going to be.>>

What nightmare??

In my third response in this thread I gave you an 11 step process to make the PK
change. If you analyzed your tables like I explained how to do immediately
before the 11 step process and settled on keeping your original tables, you only
had 6 tables to do and could have done it in less than half the time you spent
on this thread.

So I repeat, What nightmare??

<<I fully expect that sh** is going to happen >>
May I add "if I use the DHSNo for a primary key"

Steve
PC Datasheet
 
S

S. Jackson

Maya said:
I am with Rolls 125%!!

The following statement you made supports that he is correct --
<<Once assigned by the court, it stays that way until the case is closed or
consolidated with another. If
the case is consolidated, a new record is created and the old record is
"closed.">>

If a case starts out as 02-3456-K and is consolidated with another case and now
becomes 04-7755-K, the reference to all previous foreign keys is lost and all
the related records in the other tables become orphaned. If an autonumber is
used for the primary key in TblCaseInfo, consolidating the case with another
case simply means you go into TblCaseInfo and change the DHSNo to the new
number, nothing else needs changed!

The key word above is that the record (case) is "closed." If 02-3456-K is
consolidated into existing case 04-7755-K, 02-3456-K is given a "closed
date" and disposition field directs the user to the second existing case
04-7755-K. From that point forward the user will only make entries into
04-7755-K. The reason cases are consolidated in a court of law is because
they involve the same parties with same or similar issues or facts = same
data! There is no need to transport 02-3456-K's data into 04-7755-K - at
least for my purposes anyway. The db is for administrative law cases in
only one court, not multiple courts so the DHSno format will always be the
same.
For those proponents of using the DHSNo for a primary key ---
The prime directive of database design is that any data item should appear only
once in the tables of a database. DHSNo is a data item and as such should appear
only once in the tables.

It does.
Using the DHSNo as a primary key causes this data item
to appear multiple times throughout the tables and consequently violates the
cardinal rule of good database design.

To clarify - do you mean multiple times within the same table? Because it
does not - it can only appear once within a table. Or do you mean multiple
times in that it appears once in more than one table? Wouldn't a new
autonumber field appear multiple times throughout the tables too since it
would become a foriegn key? Sorry, you lost me here.
The issue of the possibility of
consolidating a case clearly illustrates why it is imperative to follow this
prime directive.

Answered above.
Yes, you may be able to put bandages on the database when this
happens, but had you followed the prime directive inthe first place you would
have no scraped knees.

I've already stock-piled my supply of bandages and first-aid equipment since
I am inexperienced and untrained. I fully expect that sh** is going to
happen :-D
Rolls - well done!

Shelly, you would be well advised to follow Rolls recommendation!

Thank you Steve. Everyone has certainly provided me with good reasons to do
one thing or the other. So far, I have been arguing against making the PK
change because of the nightmare it is going to be. But for the record, I
have not made up my mind on this issue. Sigh . . .I dunno what to do.

But, hey! Thanks for the debate - I very much enjoyed it. :)
 
R

Rolls

Do you have a reference for the "fact" that a numerical key "runs faster"

I've tested LongInt vs. text (alphanumeric) and found that numeric takes
less disk space and runs 40% +/- faster.

The test was to use either base 10 or base 36 (0 - 9 plus A - Z).

Same for integer vs. hex.

Have not found anything faster than an autonumber field to use as a key
field.
 
R

Rolls

Shirley - I would say that if you are already using your case number
(alphanumeric) and you are absolutely, positively sure that a specific case
number won't be changed, or that a transposition won't ever be entered and
then later corrected, you might be safe using that field as your key. I
won't guarantee it.

You'll also have to store the longer string in the FK fields that reference
your PK case number.

If you want to prove to yourself that it can be done, try it.

My suggestion, though, is to standardize your development "style" to use
autonumber key fields only, and to hide keys from users (they merely denote
"unique" and should have no meaning). Conceptually, concentrate on thinking
in terms of the Entity-Relationship model, the parallel Parent : child
relationships in the data and the corresponding form : subform layout.
Once you've "got it" you'll have established as close to "reuseable code" as
is possible in Access, and will be able to spend more time on data modeling
(thinking about the real-world problem) and less on table design and forms
coding.

I can tell you, however, that 80% of the corporate Access databases I've
been asked to "fix" or enhance have denormalized tables that had to be
restructured before the functionality requested by the owner could be
restored or improved upon. This is due to self-taught, trial and error,
design.

Regarding "case consolidation" you could set up a two field table
"predecessor" and "successor" table. There would be one row for each
consolidation plus one row of each unique case number with itself (same
thing in both fields). This way you could query any predecessor case and
return all successor cases and vice versa. Alternately there is a complex
recursive procedure that would not require as much data. This will let you
navigate through a "tree" data structure where one thing branches into
another.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top