Data Primary key vs. Artificial (Autonumber) primary key

Jeff Boyce · Dec 5, 2007

Oh, you are in so much trouble now!

Just because Douglas and I (Jeff) have traditionally male surnames doesn't
mean we're males... (oh wait, we are, never mind<g>)

Regards

Jeff

Jeff Boyce · Dec 5, 2007

Thanks! I suspect you're right.

I don't use Autonumbers, ... except as Primary Keys, so they are, by (my)
definition, unique!

I keep forgetting that not everyone shares MY definition!

--
Regards

Jeff Boyce
www.InformationFutures.net

Microsoft Office/Access MVP

Microsoft IT Academy Program Mentor
http://microsoftitacademy.com/

Jamie Collins · Dec 5, 2007

Just because Douglas and I (Jeff) have traditionally male surnames doesn't
mean we're males... (oh wait, we are, never mind<g>)

I've also seen the 'traditionally male' beards in the MVP mug
shots ;-)

Jamie.

--

Jeff Boyce · Dec 5, 2007

Facial hair is not restricted to males...

(hence the facination of the "Bearded Lady" in carnivals!)

You are STILL in trouble!

Regards

Jeff Boyce
Microsoft Office/Access MVP

Jamie Collins · Dec 5, 2007

Facial hair is not restricted to males...

....in the set of Access MVPs?

Jamie.

--

Dennis · Dec 5, 2007

I was speaking from a pure theory perspective, not from an Access-specific one.

OLE objects aren't "data" per se; they're, well, OBJECTS. And memo fields
cannot be used as indexes or searched on, so as a key they'd be impossible to
use. In the "old-timey world" one needed to keep one's keys as short as
possible while being meaningful. A key-length over about 15 characters almost
guaranteed terrible application performance. But that was then....

John W. Vinson · Dec 5, 2007

"If you're going to create relationships, they will
always be based on the primary key. You cannot change that."

Michael Gramelspacher said (paraphrasing): "I can create a FOREIGN KEY
REFERENCE (a.k.a. Access Relationship with referential integrity
enforced) based on a UNIQUE constraint, which goes against what Jeff
Boyce said about relationships always being based on the primary key."

You're right (about the meaning if not the attribution), Jamie - I
misinterpreted Michael's point.

John W. Vinson [MVP]

John W. Vinson · Dec 5, 2007

And that means at least occasionally correcting or
changing it.

well... there *is* Cascade Updates to cover that possibility.

But I agree, it's ideal if the primary key is rock solid stable. If it's only
San Andreas faultline rock solid stable, you need to depend on cascade
updates, which do work but have their own problems!

John W. Vinson [MVP]

M. · Dec 5, 2007

Thanks for this helpful answer.

Is the following summary correct:

The (unique) index that would be used most frequently for sorting or
filtering the table records, should be defined as primary index from a
performance point of view, because the primary index determines the
sequential physical order on harddisk ?

M.

Fred · Dec 5, 2007

I'm from Chicago, we don't know what fault lines are. Could you use a
tornado analogy so that I can understand?

:-)

Fred

Jeff Boyce · Dec 5, 2007

(psst, Fred, can you say New Madrid?)

Jeff

Fred said:
I'm from Chicago, we don't know what fault lines are. Could you use a
tornado analogy so that I can understand?

Fred

David W. Fenton · Dec 6, 2007

Why? By definition, an Autonumber is supposed to already be
unique, so you wouldn't gain anything by indexing it.

Eh? The unique index has an effect on the type of relationship. If
both sides of the join lack a unique index, it's an undefined
relationship type. If one side does, it's a many-to-one. If both
sides do, it's a 1:1.

David W. Fenton · Dec 6, 2007

My main question is: are there negative aspects associated with
using a primary key based on data fields versus using a primary
key based on an artificial primary key as generated with an
autonumber field? In both cases the autonumber field would be used
for defining relations between tables.

Until now my answer would be: there are no negative aspects
associated with the data fields approach.

Have you tried it? For instance, try creating a join between two
tables where the key they are joined on has 5 fields in it.

My philosophy:

Single-field natural PKs are OK, though they should be for data that
is never updated once entered.

Multi-column keys are a horrid amount of work, and duplicate a
helluva lot of data.

It also makes your joins less efficient.

Last of all, very few of the tables in the apps I'm asked by clients
to build can have no Nulls, so natural keys are only usable if you
assign a default value. If you use a zero-length string, then that
means you can't query on Null, and it means that you can't propagate
Nulls in concatenation operations without using an intermediary
function to convert ZLS's to Null.

David W. Fenton · Dec 6, 2007

The (unique) index that would be used most frequently for sorting
or filtering the table records, should be defined as primary index
from a performance point of view, because the primary index
determines the sequential physical order on harddisk ?

No, unless you're limiting the discussion to Jet.

In SQL Server, for instance, you can cluster on any unique index
(i.e., write the table in that index's order). It doesn't have to be
the PK.

Jamie Collins · Dec 6, 2007

I'm from Chicago, we don't know what fault lines are. Could you use a
tornado analogy so that I can understand?

I'm from the UK, where both weather and geology is relatively benign.
So that we all may understand the point, can you please use an example
(in place of an analogy) of an industry standard key with a trusted
source which manages changes. As a suggestion, consider ISBN which
recently changed from 10 to 13 characters (which Amazon and many
public libraries seem to have survived <g>) and for which duplication
occasionally occurs.

Jamie.

--

Jamie Collins · Dec 6, 2007

Thanks for this helpful answer.

Is the following summary correct:

The (unique) index that would be used most frequently for sorting or
filtering the table records, should be defined as primary index from a
performance point of view, because the primary index determines the
sequential physical order on harddisk ?

Not exactly. To repeat: "This allows Microsoft Jet to take full
advantage of its read-ahead cache and also reduces disk I/O when doing
sequential scans of a table." The best example I can think of for
sequential scans of a table is using BETWEEN in SQL because the data
would be on contiguous pages. Tthink of a paper copy telephone
directory ordered by last name then first names and I asked you to get
me all the telephone numbers of people whose last names begin with the
letter 'C' and how easy it would be to rip out the appropriate pages,
then think about how difficult it would be to do the same if the task
was to get all the telephone numbers which began with the numeral '5'.
The next best example would be GROUP BY.

I suggest you do some research on clustered indexes *generally*; you
won't find much in the Jet world because most users disregard or are
unaware of (or even are in denial of <g>) the effects of clustering in
Jet. David W. Fenton's suggestion of looking at SQL Server (where you
have an explicit choice of clustered index) seems a good one. Which
reminds me, David came up with the brilliant suggestion of making a
random Autonumber column the sole PK to favour concurrency (because
values created close together in transaction time have more chance of
appearing on *different* pages).

Jamie.

--

Jamie Collins · Dec 6, 2007

=?Utf-8?B?TS4=?= <[email protected]> wrote innews:[email protected]:

Have you tried it? For instance, try creating a join between two
tables where the key they are joined on has 5 fields in it.

Multi-column keys are a horrid amount of work, and duplicate a
helluva lot of data.

Isn't this the problem Access Relationships were invented to solve?
i.e. defining the Relationship on those five columns is a one-time
task, after which you simply add the tables to the query builder tool
thing and it types out the JOIN syntax for you. Me, I can type fast
(and the parser does spell chekcing for me <g>) so I don't really mind
the 'hard' work. And if you are concerned about disk space, port to a
back end without a 2GB (or whatever) limit and buy a bigger drive
(relatively inexpensive these days).

Using meaningless keys in your tables results in many table joins to
get basic key data so you have to balance these things out. I'm wary

of people who use the same hammer to bash every kind of fixing said:
Last of all, very few of the tables in the apps I'm asked by clients
to build can have no Nulls, so natural keys are only usable if you
assign a default value.

I also have only a few nullable columns (subatomic elements, usually
of type DATETIME) and absolutely zero columns that allow a zero length
string but still your point alludes me. I suspect the answer is that
you are have to denormalized tables but could you please clarify with
an example? TIA.

Jamie.

--

Jamie Collins · Dec 6, 2007

I was speaking from a pure theory perspective, not from an Access-specific one.

I get into trouble myself for doing that said:
OLE objects aren't "data" per se; they're, well, OBJECTS.

LOL! Did I interpret the word 'data' too literally <g>? Is OLEOBJECT a
*data* type? Can columns of type OLEOBJECT be found in a Jet
'database'?

And memo fields
cannot be used as indexes or searched on

I don't know what you mean but memo fields can be indexed and can be
used in WHERE and HAVING clauses; you can cast them using CSTR() in
SQL code to prevent them from being truncated in certain
circumstances.

Jamie.

--

Rod Plastow · Dec 6, 2007

Jamie,

You forgot to mention the penultimate phone number change that meant phone
numbers would never have to change again - that is until the next phone
number change about four years later. :-)

Let me nail my colours to the mast; I belong to the religious sect of a
'meaningless' pk. I believe the job of a pk is to identify the row and
nothing more and I've the dog collar to demonstrate my faith - well that is
until I come across a situation where to include a meaningless key simply
doubles the effort such as an accounting period table that I key with yyyymm
- surely there will never be another Pope Gregory!

As for 'Rock Solid:' I'm in the Philippines and we experienced an earthquake
last week so geological analogies do not work here.

Regards,

Rod

Jamie Collins · Dec 6, 2007

Let me nail my colours to the mast; I belong to the religious sect of a
'meaningless' pk. I believe the job of a pk is to identify the row and
nothing more and I've the dog collar to demonstrate my faith

I'm borderline 'atheist' because I don't use the PRIMARY KEY on every
table, and I have a relationally sound basis for doing so because a
key is a key is a key; calling one 'primary' is arbitrary.

But really it has nothing to do with 'religion': PRIMARY KEY behaves
the same in Jet regardless of your beliefs. So let's stick to the
facts.

What does PRIMARY KEY do for you that other keys ('non-primary keys'?)
do not? Doesn't allow the NULL value but I've got NOT NULL at the
column level for that. Let's you omit the PK columns in the RFERENCES
clause in SQL DDL but I prefer to write things out longhand. Puts the
columns in bold text in the Access UI but I don't look at the
Relationships picture (I find SQL code easier to interpret). So
clustering is the only aspect of PRIMARY KEY which I find appealing
for Jet. But try telling someone to use NOT NULL UNIQUE for their
existing PKs and leveraging the clustering nature of PRIMARY KEY as a
'physical' index and in most cases it's too much of a paradigm shift
for them to comprehend.

Jamie.

--

unique composite key	6	Sep 22, 2008
compound primary key versus autonumber	11	Dec 30, 2005
AutoNumber vs. Natural Primary Keys	6	Jan 7, 2010
How to prevent duplicates of non-primary key	14	Jan 21, 2008
Correct design and primary keys for linking multiple tables	4	Feb 20, 2009
Composite Primary Key with an ID and Index	2	Aug 18, 2009
Change primary key from autonumber to text field?	1	Feb 27, 2008
Are Primary Keys Necessary?	5	May 11, 2007

Data Primary key vs. Artificial (Autonumber) primary key

Jeff Boyce

Jeff Boyce

Jamie Collins

Jeff Boyce

Jamie Collins

Dennis

John W. Vinson

John W. Vinson

M.

Fred

Jeff Boyce

David W. Fenton

David W. Fenton

David W. Fenton

Jamie Collins

Jamie Collins

Jamie Collins

Jamie Collins

Rod Plastow

Jamie Collins

Ask a Question

Similar Threads