Autonumber Fields

Guest

Access allows a table to have an autonumber field which could be considered a
record number.

In an application I am involved in developing there are a number of code
tables which are using the autonumber field as the code. This autonumber
code is then used in the tables holding the data.

I am seeking opinions on this approach. Is this a reasonable practice? What
are the dangers of doing this?

I come from a mainframe environment where this sort of approach is avoided
by generating a unique code value which is not an effective record number as
a record is added.
 
Rick Brandt

Denis said:
Access allows a table to have an autonumber field which could be
considered a record number.

In an application I am involved in developing there are a number of
code tables which are using the autonumber field as the code. This
autonumber code is then used in the tables holding the data.

I am seeking opinions on this approach. Is this a reasonable
practice? What are the dangers of doing this?

I come from a mainframe environment where this sort of approach is
avoided by generating a unique code value which is not an effective
record number as a record is added.

An AutoNumber is also a generated value that is NOT an effective record number.
Its only guarantee is uniqueness, not a gapless progression of incrementing
numbers.
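
A minimal sketch of "unique, but not gapless", using Python's built-in sqlite3 module as a stand-in for Access/Jet. That substitution is an assumption made only so the example runs anywhere: SQLite's AUTOINCREMENT is not a Jet AutoNumber, and the table and column names are hypothetical. It does share the property described above, since deleted values are not handed out again.

```python
import sqlite3

# SQLite stands in for Access/Jet here (an assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codes ("
    " code_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " label TEXT NOT NULL)"
)

for label in ("alpha", "beta", "gamma"):
    conn.execute("INSERT INTO codes (label) VALUES (?)", (label,))

# Delete two rows, then add a new one: the freed numbers are not reused.
conn.execute("DELETE FROM codes WHERE label IN ('beta', 'gamma')")
conn.execute("INSERT INTO codes (label) VALUES ('delta')")

ids = [row[0] for row in conn.execute("SELECT code_id FROM codes ORDER BY code_id")]
print(ids)  # the values are unique, but the sequence has a gap
```

The generated value identifies a row; it says nothing reliable about how many rows exist or where a row sits in the table.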
 
Roger Carlson

Autonumber fields make excellent Primary Keys. That's what they were
designed for, and as Rick said, they are not designed to be a record number.
They are used only to create a guaranteed unique value for relating tables.
I use them in every database and highly recommend their use.

--
--Roger Carlson
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
Jeff Boyce

Denis

As Rick & Roger point out, autonumbers are designed to serve as unique row
identifiers. If this matches your definition of "record number", fine. You
can find some folks who are willing to expose autonumbers to their users,
but I find them (the autonumbers, not the users or the folks) unfit for
human consumption.

I make extensive use of tables of lookup values, and, where suitable, use
autonumbers IDs on those tables.

JOPO (just one person's opinion)

Jeff Boyce
<Access MVP>
 
peregenem

Roger said:
Autonumber fields make excellent Primary Keys.

You've misunderstood what PRIMARY KEY means. A unique integer which
has no meaning in respect of the entities being modelled makes a lousy
PRIMARY KEY. Google for "clustered index" in the Access groups.

An autonumber is a convenient uniquifier but uniqueness for its own
sake may not be such a good thing.
 
BruceM

That you disagree with somebody does not make that person wrong. Roger has
provided a wide range of assistance in this forum, and has made samples
available on his web site. Based on his track record I would be inclined to
follow his advice. If you are trying to convert people to the idea of using
clustered indexes, a very basic discussion of what they are would be most
helpful. I have taken your suggestion to look at Google groups. There is
indeed a lot of discussion, but I have not yet found how I would create a
clustered index if I wanted to. My databases with a few thousand records
seem to work just fine. Why would I want to put extra effort into something
that already works well? I know you have posted code that includes MAKE
TABLE or some such, but the utility of such code is not clear. The other
thing I noted in Google groups is that most of the discussion of clustered
indexes seems to be in discussions about SQL server.
 
Craig Alexander Morrison

Jet 4.0 and 3.5 (and earlier versions) cluster on the Primary Key and a
Compact will keep it managed.

Indeed a clue to this is the Registry entry for:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines

both contain the setting CompactByPKey.

I am not sure what would happen if you changed the above setting from 1. I
expect 0 would skip the clustering; I am not sure whether any other setting
would be valid.

SQL Server generally clusters on the Primary Key, however, you can select
another index.

AutoNumbers are very poor devices to truly define a unique record in the
real world. You can enter the name John Doe 1,000,000 times in your database
if the Primary Key is an AutoNumber and you have failed to do something to
prevent the creation of 1,000,000 John Does. You may have 1,000,000 unique
records, but so what?

Recommending the AutoNumber as Primary Key without pointing out the dangers,
and suggesting the definition and declaration of the natural key (should one
exist), is unwise.
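
The John Doe point can be sketched as follows, again assuming Python's sqlite3 module as a stand-in for Access/Jet. The table names and the choice of name plus date of birth as the natural key are hypothetical, picked only to make the example concrete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Surrogate key only: nothing stops the same person being entered repeatedly.
conn.execute(
    "CREATE TABLE people_surrogate ("
    " person_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " last_name TEXT NOT NULL,"
    " first_name TEXT NOT NULL)"
)
for _ in range(3):
    conn.execute(
        "INSERT INTO people_surrogate (last_name, first_name) VALUES ('Doe', 'John')"
    )
duplicate_count = conn.execute("SELECT COUNT(*) FROM people_surrogate").fetchone()[0]

# Same surrogate key, plus a declared natural key (hypothetical choice of columns).
conn.execute(
    "CREATE TABLE people_keyed ("
    " person_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " last_name TEXT NOT NULL,"
    " first_name TEXT NOT NULL,"
    " birth_date TEXT NOT NULL,"
    " UNIQUE (last_name, first_name, birth_date))"
)
insert_sql = (
    "INSERT INTO people_keyed (last_name, first_name, birth_date) VALUES (?, ?, ?)"
)
person = ("Doe", "John", "1970-01-01")
conn.execute(insert_sql, person)
try:
    conn.execute(insert_sql, person)  # second copy of the same person
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(duplicate_count, rejected)
```

The autonumber can stay as the primary key in both tables; what differs is whether the natural key has also been declared.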

BTW A clustered index is merely a physical ordering of the records in a
table in the database file. Using the true natural key (should one exist) as
the primary key will ensure that all the records with a similar PK will be
physically located next to each other. Using an AutoNumber (sequential
order) as PK will mean the records are clustered according to their creation
order. Using IDENTITY and AutoNumber as PK defeats the purpose of PK, this
is not so bad in SQL Server as it allows you to choose something more
sensible if you have an IDENTITY field in use as PK.
 
Craig Alexander Morrison

Using IDENTITY and AutoNumber as PK defeats the purpose of PK, this
is not so bad in SQL Server as it allows you to choose something more
sensible if you have an IDENTITY field in use as PK.


When I wrote that it defeats the purpose of the PK it should have read it
defeats the purpose of clustering on the PK unless the creation order is the
one you want to physically order your records.

--
Slainte

Craig Alexander Morrison
Crawbridge Data (Scotland) Limited
 
BruceM

Thank you for the explanation. It makes sense that it has to do with
physical ordering in a table rather than on the disk. Having said that, I
cannot discover the connection between indexes, the table's Order By
property, and anything else that suggests an order within the table, on the
actual order of records in the table. Order By, in particular, seems to
accomplish nothing.
Regarding John Doe, it may well be a name used by more than one person. How
does this fit in with clustered indexes? I may need duplication in that
field.
Suppose I wanted to create a clustered index in an Access table. How would
I do that? The term does not appear in Access Help, and discussions of the
subject tend to assume the reader knows what a clustered index is and how to
create one. Even if one is created, what benefits will I notice?
 
peregenem

BruceM said:
That you disagree with somebody does not make that person wrong.

That someone has a track record of providing a wide range of assistance
does not make that person right ;-)

In this case, I disagree with the person because they do not understand
what PRIMARY KEY means. Autonumber does not make a good uniquifier, let
alone a good PK (different concepts). Remember this list? The
advantages of using autonumber are:

1. Convenience, because it's provided by the 'system';
2. It's an 'efficient' data type;
3. Erm...
4. That's it!
 
Craig Alexander Morrison

Physical is Disk!

Clustered Index can only be a Primary Key.

You may need several fields to define uniqueness. Several fields can make up
an index, and likewise a primary key, which is itself an index rather than a
field.

Order can be anything you want whenever you want it using SQL. If you are
going to sort by a specific field or combination of fields you may consider
adding an index to that field or combination of fields.

Indexes speed things up when sorting and analysing data, they can slow
things down if you are inserting data, especially bulk updates.
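
A minimal sketch of the effect of an index on a sort, assuming Python's sqlite3 module as a stand-in for Access/Jet (Jet's query optimiser is different and is not modelled here). The table, column and index names are hypothetical. The query plan is captured before and after the index is added, so the two can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO employees (last_name) VALUES (?)",
    [("Smith",), ("Jones",), ("Adams",)],
)

query = "SELECT last_name FROM employees ORDER BY last_name"


def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index on last_name yet
conn.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")
plan_after = plan(query)   # the optimiser can now read rows in index order

names = [row[0] for row in conn.execute(query)]
print(plan_before)
print(plan_after)
print(names)
```

The result of the query is the same either way; only the work needed to produce it changes, and with a handful of rows the difference would not be measurable.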
 
BruceM

What would you do to guarantee uniqueness in a Contacts table or some such
involving names and addresses, in light of the fact that names and addresses
are subject to change?

SQL underlies Access queries. The design grid is a sort of SQL GUI (as I
understand it). So I think you're saying that displayed order (e.g. sorted
by last name) is not what you are talking about when you talk about physical
order. If I understand, you are saying that the structure of the index
determines the order on the disk, not the order in the table when it is
viewed directly.

I have a database that includes an Employees table. The primary key is the
EmployeeID. With it to do over again I might have used something else,
because it is at least possible that they will one day change the format of
EmployeeID, which is just a sequential 4-digit number. In most cases I sort
the employee names by last name. Would adding an index to that field maybe
speed up some operations, even though the list is rather small (fewer than
100 current employees, along with a number of former employees)?
 
Amy Blankenship

BruceM said:
What would you do to guarantee uniqueness in a Contacts table or some such
involving names and addresses, in light of the fact that names and
addresses are subject to change?

SQL underlies Access queries. The design grid is a sort of SQL GUI (as I
understand it). So I think you're saying that displayed order (e.g.
sorted by last name) is not what you are talking about when you talk about
physical order. If I understand, you are saying that the structure of the
index determines the order on the disk, not the order in the table when it
is viewed directly.

I have a database that includes an Employees table. The primary key is
the EmployeeID. With it to do over again I might have used something
else, because it is at least possible that they will one day change the
format of EmployeeID, which is just a sequential 4-digit number.

Which is why most people use completely meaningless Autonumber fields as
primary keys. Because you can't change the value, format, or anything else
of a field that is currently being used as a primary key. Also the
autonumber field will usually have a smaller size (on disk, no less) than a
more meaningful key. Therefore, if you are using it in a relationship or
relationships, the other tables will have to store less information when
they are referring to that primary key of this table.

So, for instance, if you had an employeeID that was an autonumber, all of
the other tables that refer to your EmployeeID would have saved 11 bytes
every time they had a foreign key to your employee table, and you could have
stored what is now your employeeID primary key just once, for a total of
just the one 15 byte storage of the employeeID string. This is the whole
point of normalization. Anything that is actually used as data should just
be stored once, with the smallest possible reference to it from other places
that need to relate to the base data.

More than likely you'll eventually have to move to an Autonumber primary key
there for the above listed reasons. Most of us encounter this situation at
least once, and from that point forward we use Autonumber primary keys,
since fixing the problem once it has developed is much more of a pain than
preventing it.
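
A minimal sketch of the arrangement described above, assuming Python's sqlite3 module as a stand-in for Access/Jet. The names employees, timesheets, employee_pk and badge_no are hypothetical, and the byte counts mentioned above are not reproduced. The visible employee number is stored once, and other tables refer to the row through the meaningless generated key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_pk INTEGER PRIMARY KEY AUTOINCREMENT,  -- meaningless surrogate key
    badge_no    TEXT NOT NULL UNIQUE                -- visible employee number
);
CREATE TABLE timesheets (
    timesheet_id INTEGER PRIMARY KEY AUTOINCREMENT,
    employee_pk  INTEGER NOT NULL REFERENCES employees (employee_pk),
    hours        REAL NOT NULL
);
INSERT INTO employees (badge_no) VALUES ('0042');
INSERT INTO timesheets (employee_pk, hours) VALUES (1, 7.5), (1, 8.0);
""")

# The organisation changes its employee number format: one row is updated,
# and no row in timesheets has to change.
cur = conn.execute(
    "UPDATE employees SET badge_no = 'EMP-000042' WHERE badge_no = '0042'"
)
rows_touched = cur.rowcount

hours = [
    row[0]
    for row in conn.execute(
        "SELECT t.hours FROM timesheets t"
        " JOIN employees e ON e.employee_pk = t.employee_pk"
        " WHERE e.badge_no = 'EMP-000042'"
        " ORDER BY t.timesheet_id"
    )
]
print(rows_touched, hours)
```

The UNIQUE constraint on badge_no keeps the visible number usable as an identifier even though it is not the primary key.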

Hope this clarifies;

-Amy
 
Guest

Perhaps I will rephrase my question.

What are the dangers of using an autonumber field as the code for code
values?
E.g. can the autonumber field get set to a different value if I have to
re-load the table? If it can be guaranteed to be static then I have no
problem with using it as a primary key, e.g. for an employee id, but if it's
not, there seem to be some dangers in using it as such.

Certainly a person's name can never be a primary key (too many John Smiths
out there), but a key on surname is useful for an ordered lookup.

Bulk updates are not a good argument for not using a field(attribute) as a
key as they should be performed in non-prime time to minimise impact.

Normalisation is always the goal and there will always be some
fields(attributes) in the table that can uniquely define a row.
 
peregenem

Amy said:
I'm not sure what you mean by "re-load" the table.

If the OP meant this ...

CREATE TABLE Employees (
employee_ID COUNTER,
last_name VARCHAR(35) NOT NULL,
first_name VARCHAR(35) NOT NULL
);

INSERT INTO Employees (last_name, first_name)
VALUES ('Smith', 'John');
-- John Smith gets employee_ID = 1

DELETE FROM Employees;

INSERT INTO Employees (last_name, first_name)
VALUES ('Smith', 'John');
-- The same John Smith gets employee_ID = 2

... then they are correct: an autonumber can never be a true key
because the same entity gets a different key value depending on when
they were entered into the system (relative order of INSERT).
 
peregenem

BruceM said:
What would you do to guarantee uniqueness in a Contacts table or some such
involving names and addresses

For a Contacts table, last_name, first_name and postal_address makes a
fine natural key (assuming you can uniquely identify addresses <g>).
The chances that someone with the same name living at the same address
*is* the same person are very high. If they are different, then the
chances of them being related, and hence being in contact with the
intended person themselves, are high again. Adding an autonumber to
this Contacts table is not going to help you resolve this situation.
You'd have to tell them they are ContactID=1 and every time you
contacted them you'd have to check their ContactID to ensure you
weren't addressing their eponymous grandfather... unless they'd
divulged their ContactID. Anyhow, in doing so you'd have to 'expose'
the autonumber value and even the regulars who advocate autonumbers
will tell you this is taboo. Keys are all about ... what's the word
here? ... trust, security, etc. For a Contacts table, name and address
are good enough because the consequences of getting the wrong person
aren't all that bad (hey, maybe the granddad will buy your product
<g>). Higher levels of trust/security require different
information to be stored/issued: PIN numbers, favourite question and
answer, mother's maiden name, 'An email has been sent...reply or follow
the link...', a personal appearance plus ID, photo ID, fingerprints,
retina scan, etc. Autonumber does not help identify an entity in
reality (in the data model); it can only be used internally (in the
database).

BruceM said:
in light of the fact that names and addresses
are subject to change?

Who says a key can't change? What do you think ON UPDATE CASCADE is
for?
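
A minimal sketch of ON UPDATE CASCADE with a natural key, assuming Python's sqlite3 module as a stand-in for Access/Jet. The contacts and calls tables are hypothetical, and the key is shortened to the two name columns to keep the example small. When the key value changes in the parent table, the referencing rows follow it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE contacts (
    last_name  TEXT NOT NULL,
    first_name TEXT NOT NULL,
    PRIMARY KEY (last_name, first_name)
);
CREATE TABLE calls (
    call_id    INTEGER PRIMARY KEY,
    last_name  TEXT NOT NULL,
    first_name TEXT NOT NULL,
    FOREIGN KEY (last_name, first_name)
        REFERENCES contacts (last_name, first_name)
        ON UPDATE CASCADE
);
INSERT INTO contacts (last_name, first_name) VALUES ('Smith', 'Jane');
INSERT INTO calls (last_name, first_name) VALUES ('Smith', 'Jane');
""")

# The contact changes her name: only the parent row is updated explicitly.
conn.execute(
    "UPDATE contacts SET last_name = 'Jones'"
    " WHERE last_name = 'Smith' AND first_name = 'Jane'"
)

# The referencing row has been rewritten by the cascade.
child_rows = conn.execute("SELECT last_name, first_name FROM calls").fetchall()
print(child_rows)
```

The cost is that every referencing row is rewritten on each key change, which is the trade-off against a surrogate key that never changes.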
 
peregenem

BruceM said:
Suppose I wanted to create a clustered index in an Access table. How would
I do that?

That's it! You've hit on the golden question. You create a clustered
index by using the PRIMARY KEY declaration. There is no other way to
create a clustered index in Access/Jet.

If you want a non-nullable unique CONSTRAINT, you use NOT NULL UNIQUE.
If you want a non-nullable unique clustered INDEX, you use PRIMARY KEY.
CONSTRAINTs are all about data integrity (logical). INDEXes are all
about performance (physical).

BruceM said:
The term does not appear in Access Help, and discussions of the
subject tend to assume the reader knows what a clustered index is and how to
create one.

There is info out there but it is easy to miss. One view is that there
is no 'choice' for a table's clustered index, it's either PK order (PK
exists) or date/time order (no PK exists). In SQL Server, for example,
you can explicitly specify NONCLUSTERED. In Access/Jet, CLUSTERED is
implicit, default and compulsory i.e. comes as standard with PK every
time even if you don't want it. The point is, for an autonumber you
*don't* want it.

Here's a couple of relevant articles you may have missed:

ACC2000: Defragment and Compact Database to Improve Performance
http://support.microsoft.com/default.aspx?scid=kb;en-us;209769

New Features in Microsoft Jet Version 3.0
http://support.microsoft.com/default.aspx?scid=kb;en-us;137039

BruceM said:
Even if one is created, what benefits will I notice?

What are the benefits? Improved performance, especially with queries
that can take advantage of physically contiguous rows e.g. GROUP BY or
BETWEEN constructs. That is assuming you've chosen the PK
appropriately. Conversely, if you've chosen unwisely, e.g. you've made
your autonumber column the PK, you will take a performance hit. Will you
notice? There are too many factors to generalize; you must test. With a
table of 100 rows, I doubt you would be able to *measure* any
performance difference :)

BruceM said:
Regarding John Doe, it may well be a name used by more than one person. How
does this fit in with clustered indexes? I may need duplication in that
field.

I suppose an autonumber could help you out here i.e. you only need
(last_name, first_name) for your clustered index but you need to satisfy
the UNIQUE attribute that PRIMARY KEY requires. Note that the ordinal
position of the columns in the PRIMARY KEY declaration is significant:
CREATE TABLE Blah (
first_name VARCHAR(35) DEFAULT '{{NK}}' NOT NULL,
last_name VARCHAR(35) DEFAULT '{{NK}}' NOT NULL,
.... (other columns) ...
uniquifier IDENTITY (1,1) NOT NULL,
PRIMARY KEY (last_name, first_name, uniquifier)
);

... However, the autonumber is usually not required because there
should be a natural key i.e. attribute(s) which uniquely identify an
entity. So use the existing key at the end of the PK declaration. Using
an autonumber in place of (rather than in addition to) a natural key
will lead to pain sooner or later.
 
Guest

Amy,

By re-loading I mean empty/recreate the table and put the data back again.
Why would you do this? Perhaps recovery from corruption etc...

If you back up a table with an autonumber field and there are gaps in the
number sequence due to deletions what happens to the autonumber field if you
restore from this backup?
 
peregenem

Denis said:
By re-loading I mean empty/recreate the table and put the data back again.
Why would you do this? Perhaps recovery from corruption etc...

If you back up a table with an autonumber field and there are gaps in the
number sequence due to deletions what happens to the autonumber field if you
restore from this backup?

You can INSERT explicit values into an autonumber (COUNTER) column:

CREATE TABLE Test (
key_col COUNTER NOT NULL,
data_col INTEGER NOT NULL)
;
INSERT INTO Test (key_col, data_col) VALUES (2147483647,1);
INSERT INTO Test (key_col, data_col) VALUES (-2147483648,2);
INSERT INTO Test (data_col) VALUES (3);

So you can use an INSERT INTO...SELECT construct to reload your table
using explicit values for the autonumber.
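
The reload can be sketched end to end as below, assuming Python's sqlite3 module as a stand-in for Access/Jet. The codes and codes_backup tables are hypothetical, and the behaviour of a real Jet AutoNumber after such a reload should be tested separately. The point shown is that the key values, gaps included, survive when they are inserted explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = (
    "CREATE TABLE codes ("
    " code_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " label TEXT NOT NULL)"
)
conn.execute(ddl)
for label in ("alpha", "beta", "gamma", "delta"):
    conn.execute("INSERT INTO codes (label) VALUES (?)", (label,))
conn.execute("DELETE FROM codes WHERE label = 'beta'")  # leaves a gap at 2

before = conn.execute("SELECT code_id, label FROM codes ORDER BY code_id").fetchall()

# "Back up" the rows, drop and recreate the table, then reload it
# with explicit values for the generated column.
conn.execute("CREATE TABLE codes_backup AS SELECT code_id, label FROM codes")
conn.execute("DROP TABLE codes")
conn.execute(ddl)
conn.execute(
    "INSERT INTO codes (code_id, label) SELECT code_id, label FROM codes_backup"
)

after = conn.execute("SELECT code_id, label FROM codes ORDER BY code_id").fetchall()

# A row added after the reload continues past the highest restored value.
new_id = conn.execute("INSERT INTO codes (label) VALUES ('epsilon')").lastrowid
print(before == after, new_id)
```

Rows in other tables that store code_id as a foreign key therefore still point at the same codes after the reload.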
 
