Fixing up structure with primary, natural, surrogate, etc.

F

Fred Boer

Ok, well, I've been reading a few threads about Primary Keys, and the use of
natural vs. surrogate keys. Sadly, a little learning is a dangerous thing,
and I've realized, (yet again.. will it never end!<groan>), that I've got a
problem with the design of my tables. When I created my tables, I just went
along cheerfully whistling to myself and created an Autonumber primary keys
for almost all of my tables. I didn't even bother with unique indexes. Only
now do I realize that I've left myself wide open for duplicate records. So,
I'm trying to fix things...

Questions:

(I have included descriptions of some of my tables and their fields
below...)

1. Using Tbl_Status as an example: There are six possible status codes,
including "On Shelf", "Discarded", "Repair", etc. So, since these will
seldom change, and they shouldn't be duplicates, should I simply have one
field "Status" and make it a natural primary key for the table? If it *were*
the PK, how would that work? I mean, would the main table simply contain the
text of the field "Status" as a FK? This would take more space than the
current Autonumber FK, wouldn't it? And it would be slower than a long
integer, correct? But the advantage would be that I'd never run the risk of
a duplicate status code, is that right? (Obviously, with only six codes,
this is only a theoretical problem, but I'd like to use good design...)

2. Currently, I haven't got primary keys or unique indexes in my junction
tables, (like Tbl_BookSeries). Do I need a primary key for that table? Or
would a unique index on both fields be sufficient? Why or why not?

3. And for a table like "Tbl_Place", would you add new fields (such as
Province/State and Country), and then use a composite primary key made up of
the three fields? This assumes I want to know the difference between
London, England, and London, Ontario, Canada...

Thanks for any suggestions! And I thought I had the structure of
tables/relationships down pat! Argh! I find myself, again, feeling a bit
like I might never get this right! <g>

Fred Boer

P.S. I'm not even sure if I'm using FK and PK correctly at this point...!



Tables using Autonumber PK:


Tbl_Library (Main Table)
BookID - Autonumber
Title - Text
Dewey - Text
etc..
Author - FK to Tbl_Author
Series - FK to Tbl_Series
Status - FK to Tbl_Status
Pubplace - FK to Tbl_PubPlace
etc..

Tbl_Author
AuthorID - Autonumber
Firstname - Text
Middlename - Text
Lastname - Text
PrimaryAuthor - Logical

Tbl_Series (Series Name)
SeriesID - Autonumber
Series - Text

Tbl_Status (Status Code)
StatusID - Autonumber
Status - Text

Tbl_PubPlace (Place of Publication)
Pub_PlaceID - Autonumber
Place - Text



Junction Tables - No PK

Tbl_BookAuthor
BookID
AuthorID

Tbl_BookSeries
BookID
SeriesID

Tbl_BookSubject
BookID
SubjectID
 
T

Tim Ferguson

When I created my tables, I just went along cheerfully whistling to
myself and created an Autonumber primary keys for almost all of my
tables. I didn't even bother with unique indexes. Only now do I
realize

Been there, done that, got the tee-shirt... :)
should I
simply have one field "Status" and make it a natural primary key for
the table?

Two choices (asterisk = PK)

Statuses
========
StatusCode* Description
1 On Shelf
2 Discarded
3 Repair

MainTable
=========
MainID* Status
10238 1
10847 3
etc.

or, to my mind, the simpler

Statuses
========
Description*
On Shelf
Discarded
Repair

MainTable
=========
MainID* Status
10238 On Shelf
10847 Repair
etc.

The advantage of the second version is that you don't have to do joins in
order to read the status value of a record; on the other hand a short
text field is slightly less efficient as a FK than an integer, so updates
are mildly slower -- my guess is you won't notice the delay though. In
effect, the Statuses table only exists in order to constrain the
MainTable.Status field, which is fine.
2. Currently, I haven't got primary keys or unique indexes in my
junction tables,

.... in that case, they are not tables :)
would a unique index on both fields be sufficient?

Yes said:
3. And for a table like "Tbl_Place", would you add new fields (such as
Province/State and Country), and then use a composite primary key made
up of the three fields? This assumes I want to know the difference
between London, England, and London, Ontario, Canada...

Harder question: efficient and correct design here depends very much on
your actual requirements. What is a Place -- what do you need to know
about it, how do Places relate to other things, do you have a table of
Countries or Continents? Do you need to know about Countries or
Continents? Some applications need very fine modelling of geographic
entities; others really come to little more than a spelling aid!

P.S. I'm not even sure if I'm using FK and PK correctly at this
point...!
Looks okay to me
B Wishes


Tim F
 
F

Fred Boer

Dear Tim:

Thanks so much for tackling my questions! Dealing with this has left me with
the uncomfortable feeling that my grasp of the fundamentals of a relational
database is shaky, if not utterly flawed!
Been there, done that, got the tee-shirt... :)
Is it a nice tee-shirt? Where can I order mine? said:
Two choices (asterisk = PK)
Yes... the first example is how I currently have things set up. The second
is what I am attempting to create. To this point, my attempts are fruitless.
Access can't set up relational integrity for some reason... but I am still
working on it. I might try starting from scratch rather than modifying my
existing tables...
... in that case, they are not tables :)

Shall I call them "The Artists formerly known as Prince"; I hear that is
available...?
Yes, this is standard, even amongst the AN diehards <duck /> !

Ok, just to be sure I'm not screwing it up: for my junction tables, I don't
need a primary key, just a unique index based on the two fields, with no
nulls or zero length strings allowed, right?

Thanks, Tim! I'm going back into the tables now... wish me luck!

Fred
 
F

Fred Boer

Dear Tim:

Ok, with your help, I *think* I have at least one working natural key. Along
the way, I discovered a completely unecessary field in my main table, and
some small problems with my data as well. And, of course, I'll need to
change my queries, forms, rowsources for comboboxes... Oh well, I'll tell my
wife I have a *lot* of extra marking to do and stay up really late for a few
nights...

So, thanks! I'm not completely confident I have it licked, so if you
wouldn't mind checking back here over the next few days... I'd appreciate
it! I have a sneaky suspicion I'll be back!

Cheers!
Fred
 
J

John Vinson

Ok, just to be sure I'm not screwing it up: for my junction tables, I don't
need a primary key, just a unique index based on the two fields, with no
nulls or zero length strings allowed, right?

Just to clarify - you DO need a Primary Key; but that Primary Key can
(should) be a two-field Primary Key constituting the two link fields.

John W. Vinson[MVP]
(no longer chatting for now)
 
F

Fred Boer

Thank-you, John! That makes sense. I appreciate the clarification!

Fred Boer

P.S. My earlier messages appear to have disappeared from whatever server I
get my newsgroup messages from... odd.. hope this gets through!
 
T

Tim Ferguson

Yes... the first example is how I currently have things set up. The
second is what I am attempting to create. To this point, my attempts
are fruitless. Access can't set up relational integrity for some
reason...

First thing to check is that the field types match exactly: if one is a
Text(length=12) then the other one must be a Text(12) too. Same applies
to Single-Single, LongInt-LongInt etc. Remember that Autonumber is a
special kind of LongInt (and therefore matches a LongInt). Memo fields
can never be used.

Second thing to check if there are data in the tables already, is that
the existing records satisfy the rules. Empty values in the FK field are
okay (nulls, not zero-length strings), but any values that are there must
exist in the related table IYSWIM.
Ok, just to be sure I'm not screwing it up: for my junction tables, I
don't need a primary key, just a unique index based on the two fields,
with no nulls or zero length strings allowed, right?

John V has answered this. If you already have the unique index, you just
have to mark in Primary and you are done.

Hope that helps -- yes, do post back if you have any further questions.
All the best


Tim F
 
F

Fred Boer

Hi Tim!
First thing to check is that the field types match exactly: if one is a
Text(length=12) then the other one must be a Text(12) too. Same applies
to Single-Single, LongInt-LongInt etc. Remember that Autonumber is a
special kind of LongInt (and therefore matches a LongInt). Memo fields
can never be used.

I have managed to get that problem sorted out. But I'm glad to get the
additional informatoin about matching the text field lengths.. I can imagine
getting tripped up by that!

Second thing to check if there are data in the tables already, is that
the existing records satisfy the rules. Empty values in the FK field are
okay (nulls, not zero-length strings), but any values that are there must
exist in the related table IYSWIM.

I do SWYM!
John V has answered this. If you already have the unique index, you just
have to mark in Primary and you are done.

Yes, thanks!

So far I've redone three or four tables, and I think I have the hang of it
now... I have a number of tables that seem to have suitable natural keys,
and it does seem to simplify things in the relationship window... and I see
how having the natural key in the main table is efficient... La-de-dah...
always something new to learn!!

Thanks for helping me!
Fred
 
F

Fred Boer

Dear Tim:

So, I re-did most of my tables, creating natural keys, setting unique
indexes, and making the back end MDB look all nice. Then I ran the front end
application to see how it would go. Oh, MAN, is it going to be a lot of work
to fix everying! What a mess! <g> Boy, when they say that you should get the
structure down correctly from the beginning, they aren't just whistling
Dixie!!

That aside, it's been interesting and fun to learn about how to use natural
primary keys! Thanks again for helping me along!

Cheers!
Fred
 
T

Tim Ferguson

Boy, when they say that you should get the
structure down correctly from the beginning, they aren't just

I think I've got one of those tee-shirts too... Oh look, a whole box full
of them :)
That aside, it's been interesting and fun to learn about how to use
natural primary keys! Thanks again for helping me along!

I am an instinctive natural-key user, but do be careful. Lots of things
that look like keys are actually bad choices: Soc Security Numbers,
Patient Hospital Numbers, FirstName+LastName all have problems. We had a
discussion on this forum about whether car registration marks would be
suitable. And so on. The long rule is that a key must be Always Unique,
Always Invariable, and Always Available (and Always <> Practically all of
the time). The short rule is: Be Very Sceptical. Because, as you have
found out, it is a horrible thing to have to go back and fix it later.

Welcome to the the Science^W Art^W Wonderful World^W^W ... oh well,
welcome to DB Design!

All the best


Tim F
 
F

Fred Boer

Dear Tim:

I hope you haven't gotten the hook of this thread out of your mouth yet!
;)..
I read your last post with interest, and I'll take your warning seriously..
Might I continue to pick your brain a little more?
I think I understand "Always Unique" and "Always Invariable", but could you
expand on "Always Available"?

For example..

I have created a small library database application. One of the tables lists
the names of book series. I thought I might use the series name itself as
the natural primary key for that table (TblSeries). So the table consists of
one text field. The series name for a book wouldn't change (always
invariable), each series name would be unique (Always Unique), but how does
"Always Available" fit in here...

Another example:

I know you say "Always invariable". Suppose I have a table of book formats
(Braille, Talking Book, Large Print, etc.), and I use the name of the format
as the primary key. So the table consists of one text field with as many
records are there are book formats. Now, suppose I decide I need to change
"Large Print" to "Big Print". If I set the join to use Cascade Update,
couldn't I make this change? But that would violate "Always invariable",
right?

I hope I haven't worn out your patience, Tim! I really appreciate your help!

Thanks!
Fred
 
T

Tim Ferguson

Good questions about how to choose a PK.
The series name for a book wouldn't change (always invariable),
each series name would be unique (Always Unique),

This is the one I would be most suspicious of. People are always calling
things by the same name!
but how does "Always Available" fit in here...

I would guess that if you there is a Series you'll know what it's called!
What I meant about avalailable refers particularly to external
identifiers: Hospital Patient Number, SSN, and so on. ISBN probably
applies here -- there will be books that are too old to have one or too
clobbered to be able to read it, so you have a problem when you try to
enter the book. With a PK value, you cannot just leave it blank and fill
it in later.
Another example:
....
Now, suppose I
decide I need to change "Large Print" to "Big Print". If I set the
join to use Cascade Update, couldn't I make this change? But that
would violate "Always invariable", right?

Yes, you _can_ use Cascade Updates to do this and in most situations that
is a practical solution. To my mind, however, it's a messy answer
compared to getting the thing right in the first place. And you do rely
on the db engine itself echoing a number of changes throughout the
database without choking or failing or having a power surge or... <g> And
it means that any distributed backups (paper printouts etc) are seriously
out of synch. Still, this is the sort of question that sorts the academic
goats from the real-world sheep: you have to balance the risk and the
inconvenience of one versus neatness and self-documentingness (huh?) of
the other.

All the best


Tim F
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top