Data Primary key vs. Artificial (Autonumber) primary key

J

Jamie Collins

I don't believe I said that autonumbers should be exposed.

You hadn't said that Autonumbers shouldn't be exposed either ;-)
Clustered indexes are a double-edged sword

Better IMO to try to wield the sword rather than ignore it and risk
stabbing yourself in the foot <g>.

[BTW I never got that idiom: surely a double-edged blade pierces the
victim much more effectively than a single edged blade, making a
better weapon of attack...?]

Jamie.

--
 
J

John W. Vinson

I encourage people to use the random flavour
of Autonumber to discourage this practice (and persistent offenders
should be forced to use GUIDs <g>).

eeep! Is that allowed under the Geneva Convention?

<bg>

John W. Vinson [MVP]
 
D

Dale Fye

I disagree, on two points.

1. I design my schemas as close to 3rd or 4th normal form as feasible.
But, rather than carry around the excess baggage of a 3, 4, or 5 field PK, I
have elected to take the path that is most efficient given my chosen
front/back tools (Access). However, even when I get the opportunity to take
advantage of the server side processing that SQL Server provides, I still use
a 'meaningless' PKs to speed development (makes my job easier) and minimize
the risk of mistakes (as one of the other posts mentioned, an update query
where one of the joins is missing can be disasterous).

2. I really enjoy the challenge of developing an application that is both
functional and user friendly. Unlike most "programmers", I also consider
myself to be extremely literate in the "database" side of things, and take
great exception with programmers who kludge their database together based on
their desired form layout.

Dale
--
Don''t forget to rate the post if it was helpful!

email address is invalid
Please reply to newsgroup only.
 
T

Tony Toews [MVP]

Jamie Collins said:
This implies you design your database schema (tables etc) for the
convenience of your front end (forms and subforms) and/or for your own
convenience?!

That would be "and" not "and/or". And yes that's a small part of the
reason yes.
My approach is to design the database schema entirely
independent of the front ends (which is done by another person
anyhow).

I do everything.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
 
D

David W. Fenton

:
The main issue I have with the 'Autonumber PK' movement is that
the message about having table constraints on the candidate keys
(and other business rules) doesn't always come across. You've said
the right things (IMO) about so-called surrogates in this thread
but I would wager than most users in the 'Autonumber PK' camp put
a PK on the Autonumber and think, "Job done" as regards table
constraints.

I'm not sure I think it's a good idea to define "best practices"
based on trying to avoid what stupid people do. If someone fails to
put in place proper constraints on fields other than your surrogate
key, then that's pilot error. It's a mistake.

On the other hand, in a lot of situations, forcing someone to use
the natural key can lead to all sorts of other problems, like
default values to avoid Nulls, and then you have to program around
those (e.g., you won't be able to use this to concatenate the name
fields: Mid(("12" + [LastName]) & (", " + [FirstName]),3), because
with default values, none of your "empty" fields will be Null).

Seems to me you're trading one set of problems for another.

I'm assuming that people will not handle the multi-column key
correctly and you're assuming that they won't handle their
constraints on non-PK keys. Six of one, etc., so I see no reason why
your approach would be superior on the basis of the argument made
above, because to me it applies equally well (and is, in fact,
worse) with multi-column natural keys (I have no objection on
principle to single-column natural keys, though I use them only on a
case-by-case basis).
Another significant issue is when Autonumber values are exposed to
users. Most informed people agree this should be avoided but again
the message isn't always cleat and so it does happen.

This is, once again, an example of pilot error.
In an ideal world,
surrogates would be implemented under the hood and not even be
exposed to DBA, developers, etc. I encourage people to use the
random flavour of Autonumber to discourage this practice (and
persistent offenders should be forced to use GUIDs <g>).

But random Autonumber PKs have problems with clustering and table
order (although in high-concurrency environments, it can be turned
into an advantage).

I know you're joking about GUIDs, but just in case someone reading
this takes you seriously, they should read this:

http://trigeminal.com/usenet/usenet011.asp?1033

(the article is specifically about replication, but many of the
problems with GUIDs in Access apply equally to non-replicated MDBs)
 
D

David W. Fenton

:
My approach is to design the database schema entirely
independent of the front ends (which is done by another person
anyhow).

I'll bet you have developers cursing you all the time.

I learn a lot about how the properties and structure of the entities
represented in database schema in the process of designing the UI.
It's a two-way process, with feedback flowing from the schema design
to the UI and back again.

This is not to say that the schema design is driven by the UI, only
that in designing a UI *for people* one can learn things about the
schema that were not in the spec, or not obvious from it.
 
T

Tony Toews [MVP]

Jamie Collins said:
In an ideal world,
surrogates would be implemented under the hood and not even be exposed
to DBA, developers, etc. I encourage people to use the random flavour
of Autonumber to discourage this practice

I disagree. Sometimes when trouble shooting problems it's a lot
easier to remember two or three digit numbers when viewing data from
different tables. I'm using my own test databases with a few or a
few dozen records. Later I'll test against client databases.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
 
T

Tony Toews [MVP]

M. said:
What still interests me, however, is how pepole prevent entry of duplicate
records when they use a autonumber field as primary index. This choice
doesn't offer you any guarantee that the real data in your record is
accidentally entered a second time in a later record. In my opinion you would
still need a unique single (e.g. ISBN number) or multifield index (e.g. book
title + author name) to prevent duplicate entry. In the end, it's then a
matter of taste whether you name the autonumber field index primary or the
composite field(s) index primary.

Like Dale said I use either unique indexes on other fields if
applicable and which surprisingly often aren't possible. So now you
have to use some business logic and depend on humans.

Book titles can be duplicate (Golden is one example with two different
authors in 2006) as well as having multiple authors, translators,
illustrator, etc.

Humans can have duplicate names and birth dates. Or Fred Jones and
his son Fred Jones live at the same house.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
 
J

Jamie Collins

I design my schemas as close to 3rd or 4th normal form as feasible.

Doubtful <g>. 3NF is a historical anomaly and 4NF is rarely desirable.
I'd wager you actually achieve BCNF and 5NF more often than you
think :)
But, rather than carry around the excess baggage of a 3, 4, or 5 field PK, I
have elected to take the path that is most efficient given my chosen
front/back tools (Access). However, even when I get the opportunity to take
advantage of the server side processing that SQL Server provides, I still use
a 'meaningless' PKs to speed development (makes my job easier) and minimize
the risk of mistakes (as one of the other posts mentioned, an update query
where one of the joins is missing can be disasterous).

It seems you use a so-called surrogate (what Codd meant by 'surrogate'
is open to debate e.g. when he said users "have no control over its
value, nor is its value ever displayed to them" did he mean DBAs and
Access Developers?) for FK references. Well, that's your design choice
and, assuming you have table constraints (rather than front end code)
to cover your candidate keys, then fair enough; I'd hope you wouldn't
use another's threads to evangelize <g>. However, note the essence of
the OP's question seems to me to be: given that someone has used
Autonumber as a so-called surrogate, what reasons would they have for
not making the Autonumber the PK? I cordially invite you to attempt to
answer that question.
I ... take
great exception with programmers who kludge their database together based on
their desired form layout.

Then why did you reply to me rather than take great exception with
Tony Toews [MVP] who made the point...?
Don''t forget to rate the post if it was helpful!

....Ah, perhaps I see why ;-)

Jamie.

--
 
J

Jamie Collins

I'll bet you have developers cursing you all the time.

We have design reviews where discussion is encouraged. Cursing is not
a great way of raising an objection, IMO.
I learn a lot about how the properties and structure of the entities
represented in database schema in the process of designing the UI.
It's a two-way process, with feedback flowing from the schema design
to the UI and back again.

This is not to say that the schema design is driven by the UI, only
that in designing a UI *for people* one can learn things about the
schema that were not in the spec, or not obvious from it.

If you are saying that your logical model is only as good as the
domain expert's spec then I am in full agreement.

Jamie.

--
 
J

Jamie Collins

I disagree. Sometimes when trouble shooting problems it's a lot
easier to remember two or three digit numbers when viewing data from
different tables. I'm using my own test databases with a few or a
few dozen records. Later I'll test against client databases.

And if you make it the PK, your client and their users will end up
with the double whammy of poor concurrency and poor clustering. But,
hey, the convenience of the developer is what counts for you, eh <g>?

Jamie.

--
 
J

Jamie Collins

I use either unique indexes on other fields if
applicable and which surprisingly often aren't possible.

That's why we were given table-level CHECK constraints.
Book titles can be duplicate (Golden is one example with two different
authors in 2006) as well as having multiple authors, translators,
illustrator, etc.

I sure wish someone would hurry up and invent an industry standard
identifier with a trusted source which will be familiar to users. Oh,
hang on: they already did (ISBN).
Humans can have duplicate names and birth dates. Or Fred Jones and
his son Fred Jones live at the same house.

Fred Jones's credit card number is unique and he is not supposed to
disclose his PIN number but how do you know it is not his son Fred
Jones using it? Fred Jones has different fingerprints to Fred Jones
but are you authorized to retain that data? It's all about trust i.e.
you need a trusted source for identifiers.

An Autonumber in an Access database cannot verify people in reality
therefore cannot be used as a trusted source. Yes that's a straw man
argument, as are your faulty examples of natural keys.

Jamie.

--
 
P

Pat Hartman

If you want to conduct an intellectual discussion, you really need to be
able to handle it when people disagree with you and stop taking cheap shots.
 
T

Tony Toews [MVP]

Jamie Collins said:
And if you make it the PK, your client and their users will end up
with the double whammy of poor concurrency and poor clustering. But,
hey, the convenience of the developer is what counts for you, eh <g>?

How does clustering apply to Access databases? And I've only ever had
one problem with concurrency in all these years. So not a problem
for me.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
 
J

Jamie Collins

If you want to conduct an intellectual discussion
you really need to be
able to handle it when people disagree with you and stop taking cheap shots.

If I really was conducting this discussion I would have answers to my
direct questions (ping Dale Fye: "given that someone has used
Autonumber as a so-called surrogate, what reasons would they have for
not making the Autonumber the PK?")

And if I couldn't handle people around here disagreeing with me I'd
have left years ago!

FYI:

The Access Web: Netiquette
http://www.mvps.org/access/netiquette.htm
Be thoughtful of bandwidth and other folks' thought processes:
Look for Smileys :), grins <g>, and other "Emoticons". When
you see one, the preceding statement was not meant to be really
serious, don't take it as such. [Unquote]

I don't think the poster was being serious either (hence my short
dismissal with a grin). Designing database for one's own convenience
rather than your client's doesn't sound like a credible proposition to
me.

Jamie.

--
 
J

Jamie Collins


I broadly agree with everything you've said here. I do still think
that many (perhaps most) users of Autonumber columns use it for their
one and only unique constraint per table; for the record, I think such
people are ill informed rather than stupid. Some should now be better
informed after reading your post :)

One point:
in a lot of situations, forcing someone to use
the natural key can lead to all sorts of other problems, like
default values to avoid Nulls, and then you have to program around
those (e.g., you won't be able to use this to concatenate the name
fields: Mid(("12" + [LastName]) & (", " + [FirstName]),3), because
with default values, none of your "empty" fields will be Null).

I don't design my schemas to fit your ready-rolled formulas <g>!. I
have no doubt that if I define my middle_name column as NOT NULL with
a DEFAULT of '{{NONE}}' (plus CHECK constraints to disallow the empty
string, spaces, etc) then you would be able to come up with a similar
formula to derive a full name. My create/update stored procs handle
the NULL value and the empty string so it shouldn't be too onerous on
the 'front end guy' to replace '{{NONE}}', '{{NK}}', etc, with
whatever is appropriate in context.

That said, person's name rarely makes a good key, so is not IMO a good
example of a natural key. My favourite is the ISBN: industry standard
(ask at your local public library or look at the URLs on Amazon),
trusted source (will resolve rare cases of duplication), familiar and
verifiable in reality (take a look at the bar code on the back cover
of your favourite book), stable (sure, they have recently changed from
10 to 13 characters but this has been managed by the trusted source to
provide an upgrade path), fixed length with a check digit (easy to
validate at time of keying in), a so-called 'intelligent' key (it
comprises detail of region, publisher, etc) and so on.

Jamie.

--
 
P

Pat Hartman

I think the answer you've gotten is that none of us can think of a good
reason to include an autonumber in a table if it is NOT the primary key.

Jamie Collins said:
If you want to conduct an intellectual discussion
you really need to be
able to handle it when people disagree with you and stop taking cheap
shots.

If I really was conducting this discussion I would have answers to my
direct questions (ping Dale Fye: "given that someone has used
Autonumber as a so-called surrogate, what reasons would they have for
not making the Autonumber the PK?")

And if I couldn't handle people around here disagreeing with me I'd
have left years ago!

FYI:

The Access Web: Netiquette
http://www.mvps.org/access/netiquette.htm
Be thoughtful of bandwidth and other folks' thought processes:
Look for Smileys :), grins <g>, and other "Emoticons". When
you see one, the preceding statement was not meant to be really
serious, don't take it as such. [Unquote]

I don't think the poster was being serious either (hence my short
dismissal with a grin). Designing database for one's own convenience
rather than your client's doesn't sound like a credible proposition to
me.

Jamie.
 
J

Jamie Collins

I think the answer you've gotten is that none of us can think of a good
reason to include an autonumber in a table if it is NOT the primary key.

I'm assuming that by 'primary key' you mean the SQL keyword PRIMARY
KEY rather than the logical meaning of 'primary key', on the basis
that an Autonumber only comes into existence in the physical
implementation and has no place in the logical model. I'm also
assuming that by 'us' you aren't alienating me <g>.

1) Clustering: as an extreme example, imagine a table with a ten-
column compound key and you wanted to cluster on just one of those
columns being a DATETIME consisting of non-unique values. In SQL
Server, if you created a clustered index solely on a column without a
unique constraint then the system would create an auto-increment
column under the hood to 'uniqueify' the values. In Jet, the only way
of influencing clustering is to use the PRIMARY KEY designation and
Jet will not allow you to create a PK solely on a column containing
non-unique values. You can do the same trick as SQL Server but you
have to explicitly create the 'uniqueifier' yourself and an Autonumber
column is a very convenient way of doing this. You'd then create a
compound PK on (datetime_col, unique_col) _in that order_ of course
(BTW how does one do this using the Access interface rather than
code?) In this scenario, the autonumber in the table is not the
primary key, though it is _part of_ the PRIMARY KEY.

2) PRIMARY KEY is arbitrary: you can put it on *any* columns (bar
those of type OLEOBJECT) in the table (unless existing values are NULL
and/or non-unique) or you can opt to omit a PRIMARY KEY completely: if
you have a unique constraint (e.g. unique index and NOT NULL
designation) on a candidate key I don't think anyone could fault you
(though they could point out that you were potentially missing out on
an opportunity for optimization). Therefore, a table with an
Autonumber column but without a PRIMARY KEY is perfectly acceptable.

I'm sure there are other reasons which smarter people than me could
come up with; perhaps a little thinking outside the box is required?

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top