Northwind Mystery

Tom Ellison · May 11, 2006

In the Northwind database, there are several indexes for which I do not
understand the purpose.

In the Products table, there are indexes on CategoryID and SupplierID. For
what purpose are these created? I haven't seen anywhere they are used. If
there is a reason for them, I'd like to know.

These are indexes of surrogate key values that are the autonumbers in other
tables.

Thanks.

Tom Ellison

Norman Yuan · May 12, 2006

Those fields (CategoryID, SupplierID) are the foreign keys, which, along
with primary keys, are used to establish relationships between tables.
Primary/foriegn key is a very basic concept of relational database. One must
understand these basic things in order to do some meaningful work with
relational database system.

Tom Ellison · May 12, 2006

Dear Norman:

OK. I won't recite my credentials, but I'm a senior developer, and I'm
extremely familiar with these concepts. There MUST be a unique index on the
column(s) of a relationship on the ONE side of the relationship. But why
index them on the MANY side of the relationship? They certainly aren't
unique there!

I have generally not seen an index on the related columns placed in the
table on the MANY side!

I suppose my question presumes some familiarity with this common database
example.

Thanks for offering your suggestion, Norman!

Tom Ellison

John Vinson · May 12, 2006

In the Products table, there are indexes on CategoryID and SupplierID. For
what purpose are these created? I haven't seen anywhere they are used. If
there is a reason for them, I'd like to know.

These are indexes of surrogate key values that are the autonumbers in other
tables.

They're not "surrogate" key values - they are foreign keys. For
example, there is an enforced Relationship defined between the primary
key of Categories and the corresponding foreign key CategoryID in the
Products table; the way that Access enforces this relationship is...
tadaa!... with an Index.

John W. Vinson[MVP]

Tom Ellison · May 12, 2006

Dear John:

An autonumber used for a relationship between two tables is certainly what
I'd call a surrogate key. If you have another definition, I'd be glad to
here it. But that's not the point here.

The question is, why would there be an index on the CategoryID column in the
Products table, that is, in the table on the MANY side of the relationshkp.
I know there must be a unique index on the key column on the ONE side of the
relationship. I have created many one-to-many relationships, but having an
index on the key column on the MANY side has never been needed. Don't you
concur?

Which brings me back to the question. Why index the CategoryID and the
SupplierID in the table on the MANY side of the relationshkp, the Products
table?

Thanks.

Tom Ellison

Albert D.Kallal · May 12, 2006

Tom Ellison said:
In the Northwind database, there are several indexes for which I do not
understand the purpose.

In the Products table, there are indexes on CategoryID and SupplierID.
For what purpose are these created? I haven't seen anywhere they are
used. If there is a reason for them, I'd like to know.

Well, you are correct, they are not really needed, and the database will
function without them.

However, if you plan run any report based on products, and want to select
(filter) only particular products of a given category, then a index would
make a very large difference in performance. The same idea applies to the
CatagryID.

If I filter, or want to find a particular product by catagoryID, then a
index would again make a VERY large difference in performance.

As a side note, any time you create a field name of number type, and the
field name ends with "ID" , ms-access will default to indexing that column
(so, actually, you have to be careful to not accident create indexes when
you don't need them!)

I have to say for the most part, that indexing these types of "catgalogry"
fields is a very good idea from a performance point of view.

So, likely, they are there because when you enter field names that end in
id...ms-access will automatic default to indexed (so, perhaps someone did
nothing here (lazy), .and just let ms-access default to having a index).

However, from a sorting, or selecting point of view...having a index here
will make a large difference in performance....

Tom Ellison · May 12, 2006

Dear Albert:

Now, that's some good stuff! Thanks!

I certainly didn't know Access would create indexes based on column names.
That seems weird indeed. But it probably explains why these indexes are
there. Whoever built Northwind may not have known there were there.

Isn't it true that having extra indexes can also be a bad thing? It takes
time to change the index whenever a new row is added, a row is deleted, or
when the subject column is updated. This is NOT so good for performance.

Your suggestion that this column might well be filtered in a query or form
is quite reasonable. The amount of performance gained with an index would
depend on what proportion of the table is filtered. If a filter includes
80% of the table, then a table scan is not so inefficient - you're going to
have to read 80% of the rows anyway. If the filter includes 1% of the
table, an index could be a big boost. Makes sense here.

To put this into Northwind without any explanation doesn't tend to serve the
rationale behind Northwind, to teach and give examples. I could not learn
this from Northwind, but fortunately there is Albert.

Now, the question is, are these indexes used somewhere in Northwind,
justifying their existence? Well, you might come along and filter by just
about any column in the database, right? So, let's start out with all the
columns indexed, just for good measure. Somehow, that's the impression I'm
getting here. Not really a good idea.

I'll be sure none of my numeric columns ends in ID. That's for sure. I
like to control the design of my tables, not have it taken from me.

Tom Ellison

Arvin Meyer [MVP] · May 12, 2006

Tom Ellison said:
Dear Albert:

Now, that's some good stuff! Thanks!

I certainly didn't know Access would create indexes based on column names.
That seems weird indeed. But it probably explains why these indexes are
there. Whoever built Northwind may not have known there were there.

I'll be sure none of my numeric columns ends in ID. That's for sure. I
like to control the design of my tables, not have it taken from me.

Hi Tom,

Not just ID, Access automatically indexes fields ending in:

ID; key; code; num

unless you disable this in Options.
--
Arvin Meyer, MCP, MVP
Microsoft Access
Free Access downloads
http://www.datastrat.com
http://www.mvps.org/access

Jeff Conrad · May 12, 2006

Hi Tom,

I'll be sure none of my numeric columns ends in ID. That's for sure. I like to control the
design of my tables, not have it taken from me.

If you go to Tools | Options | Tables/Queries
look for the text box called "AutoIndex on Import/Create:"
You'll see the default settings that Access uses to automatically
create indexes without your permission. On a default installation
you can see that Access has four entries in that text box. Any field
names that end with ID, key, code, or num will be "graciously"
indexed automatically for you.
<g>
--
Jeff Conrad
Access Junkie - MVP
http://home.bendbroadband.com/conradsystems/accessjunkie.html
http://www.access.qbuilt.com/html/articles.html

in message:

Albert D.Kallal · May 12, 2006

I certainly didn't know Access would create indexes based on column names.

That seems weird indeed. But it probably explains why these indexes are
there. Whoever built Northwind may not have known there were there.

You can *see* this then you are in table design mode...if you type in a
field name..and choose the type as number *and* the field name ends in
id...notice how the "index" property is set...you can if you please at this
point set the field to no index. So, this is NOT a invisible
occurrence...but just simply a default that access has, and you can EASILY
see this during the field design/creating process when in table design mode.

Isn't it true that having extra indexes can also be a bad thing? It takes
time to change the index whenever a new row is added, a row is deleted, or
when the subject column is updated. This is NOT so good for performance.

Yes...have to 100% agree. Just throwing up a index on every field with blind
eyes is just as bad (or worse!!) as a approach of not having any indexes at
all!!

So, that default is just a guess that MOST fields that are number type, and
end in ID likely are VERY good candidates to be indexed...but, not always...

This decision to index or not....often comes down to the black art of
performance. For example, if we were NEVER to search, or build reports by
productID, or category, then those indexes would most certainly be not
needed, and would be a performance liability here. (however, those fields
are part of relationships...and thus they *really* do need to be indexed!).

In the case of relational data joins, again the query processor can, and
*often* will use the index on the foreign key field.

Having met you in the past, I am aware that a considerable amount of your
work is with sql server. It turns out that missing a index here or there
when using sql server is not so bad from a network point of view. If you
search for a productID with sql server, and no index is present, then sql
server will perform a table scan to get that match. However, during this
scan, NO NETWORK LOAD occurs. Only AFTER sql server finds the records, does
it send them down the wire. However, the sql server disk drive certainly did
get a workout..and more load was placed on the server.

So, often missing a index here, or there does not really effect *NETWORK*
traffic and performance very much with sql server.

With a file share/jet on a network, then this is a much more critical issue.

So, sql server can do the search locally, and THEN send the records. With a
file share/jet system, that index is REALLY important, since without the
index, all records will travel across the network when searching. With a
index, only those matching records will travel. down the network wire.

So, good indexing is much more needed when you don't have sql server. In
fact, you have to be MORE aggressive in the use of indexes for a file share
as compared to sql server...

is quite reasonable. The amount of performance gained with an index would
depend on what proportion of the table is filtered. If a filter includes
80% of the table, then a table scan is not so inefficient - you're going
to have to read 80% of the rows anyway. If the filter includes 1% of the
table, an index could be a big boost. Makes sense here.
Yup..agreed...

Now, the question is, are these indexes used somewhere in Northwind,
justifying their existence? Well, you might come along and filter by just
about any column in the database, right? So, let's start out with all the
columns indexed, just for good measure. Somehow, that's the impression
I'm getting here. Not really a good idea.

Well, they did not index all of the fields. Another *Very* important point
here is that those fields are part of a relationship.

If you build a form based on Suppliers, and then have a sub-form to
display/show all of the products, how will access retrieve ONLY just the
records that belong to a given SupplerID? We load one suppler...but have a
sub-form of products. In this case, with a with NO index on the
[Products].[SupplerID] field, then table scans will occur. So, a index is
very impoarant here.

In addition to sub-forms, the issue of building a relational sql join from
Suppliers to products also *really* needs a index to perform a join with any
amount of performance. We might restive two Suppliers...and want to see all
of their products. JET (and sql server) can thus restive the two
records..and then based on the index field...grab the child records here
(say, the 12, or 15 products). Without a index on the products.SupplerID
field....jet (or sql server) can't just retrieve the few needed records.

In fact, one the BEST performance gains you can get in ms-access is to
ensure that ALL Forign key fields are indexed for relations. I index ALL of
these in my applications.

I'll be sure none of my numeric columns ends in ID. That's for sure. I
like to control the design of my tables, not have it taken from me.

As mentioned, this is only a default presented *during* table design mode.
You can see/choose the index setting. So, it is more a default during design
mode, and NOT a hidden default that you don't know about...(

Try creating a test mdb file...and start a new table in design mode. .I
believe access also scans for "code" and a few other types of keywords that
will "default" the index setting to change from none to "indexed:....but,
note it is ALWAYS in plain view to see, and this ONLY occurs when you are
adding a new field.........it is never done under the covers hidden...

Tom Ellison · May 12, 2006

Dear Albert:

Well, you are right on, as usual. Mostly, that is.

Now, I do not agree that having an index on these columns in the table on
the MANY side of the relationship are needed. I rather doubt that's what
you meant, however.

Your comments about how this works with SQL Server are quite appropriate. I
try hard not to create table scans there. There are usually a good number
of indexes in that work.

Because the SQL Server applications tend to have more users, and more
traffic, the need for indexing is probably greater. The analyzer tools help
point this out for us. If all things were equal, then you're right, this
would be more critical for Jet.

Finally, here's the nasty part. You can, of course, have only one Primary
Key. When the database is Compact/Repaired, the rows are put in the order
of that PK.

When you build a filter for a form, possibly based on its controls, you
would be filtering by the natural key. When you have a form/subform setup,
and you navigate successively through the FK table, the dependent table is
accessed successively in the natural key order, that is, the order of the
foreign key table in its form. Most reports would likely follow the same
path. The more so for my applications, where I commonly have a continuous
form for the FK table as well as the dependent table (I rewrite the SQL for
the subform using the Current event of the FK table's form, and assign a new
RecordSource each time).

Now, if the full natural key is placed into the dependent table, and is its
primary key, then when the database is compacted you are actually traversing
the dependent table in its physical sequence, which is very, very good for
performance. Again, the same thing happens in reports. If instead you have
an autonumber key for the FK table, and that is in the dependent table
instead of the natural key from the FK table, then there is no opportunity
to put the rows of the dependent table in natural key order, and the disk
access will be random instead of sequential. Considering that the segments
of the database are 4K, this means that you will be reading all over the
database frequently, instead of processing 20 to 50 rows or so sequentially
before another HD access. Remember, each access takes 10 mS or so, and they
really add up when compared to processing what is in memory.

I'm creating a version of Northwind without autonumbers, but just all
natural keys. I'll do some extensive side-by-side testing of NW as is with
my natural key version and try to demonstrate what happens. I've seen this
debated enough. I don't want to phony up some database that just makes this
test appear one way or the other. I've picked NW as a "standard" not of my
own making. So, knowing why these indexes are there is important, so we can
compare apples to apples.

Thanks for your contribution, Albert. Hope to be seeing you soon!

Tom Ellison

Albert D.Kallal said:
I certainly didn't know Access would create indexes based on column
names. That seems weird indeed. But it probably explains why these
indexes are there. Whoever built Northwind may not have known there were
there.

Click to expand...

You can *see* this then you are in table design mode...if you type in a
field name..and choose the type as number *and* the field name ends in
id...notice how the "index" property is set...you can if you please at
this point set the field to no index. So, this is NOT a invisible
occurrence...but just simply a default that access has, and you can EASILY
see this during the field design/creating process when in table design
mode.

Isn't it true that having extra indexes can also be a bad thing? It
takes time to change the index whenever a new row is added, a row is
deleted, or when the subject column is updated. This is NOT so good for
performance.

Click to expand...

Yes...have to 100% agree. Just throwing up a index on every field with
blind eyes is just as bad (or worse!!) as a approach of not having any
indexes at all!!

So, that default is just a guess that MOST fields that are number type,
and end in ID likely are VERY good candidates to be indexed...but, not
always...

This decision to index or not....often comes down to the black art of
performance. For example, if we were NEVER to search, or build reports by
productID, or category, then those indexes would most certainly be not
needed, and would be a performance liability here. (however, those fields
are part of relationships...and thus they *really* do need to be
indexed!).

In the case of relational data joins, again the query processor can, and
*often* will use the index on the foreign key field.

Having met you in the past, I am aware that a considerable amount of your
work is with sql server. It turns out that missing a index here or there
when using sql server is not so bad from a network point of view. If you
search for a productID with sql server, and no index is present, then sql
server will perform a table scan to get that match. However, during this
scan, NO NETWORK LOAD occurs. Only AFTER sql server finds the records,
does it send them down the wire. However, the sql server disk drive
certainly did get a workout..and more load was placed on the server.

So, often missing a index here, or there does not really effect *NETWORK*
traffic and performance very much with sql server.

With a file share/jet on a network, then this is a much more critical
issue.

So, sql server can do the search locally, and THEN send the records. With
a file share/jet system, that index is REALLY important, since without the
index, all records will travel across the network when searching. With a
index, only those matching records will travel. down the network wire.

So, good indexing is much more needed when you don't have sql server. In
fact, you have to be MORE aggressive in the use of indexes for a file
share as compared to sql server...

is quite reasonable. The amount of performance gained with an index
would depend on what proportion of the table is filtered. If a filter
includes 80% of the table, then a table scan is not so inefficient -
you're going to have to read 80% of the rows anyway. If the filter
includes 1% of the table, an index could be a big boost. Makes sense
here.
Yup..agreed...

Now, the question is, are these indexes used somewhere in Northwind,
justifying their existence? Well, you might come along and filter by
just about any column in the database, right? So, let's start out with
all the columns indexed, just for good measure. Somehow, that's the
impression I'm getting here. Not really a good idea.

Click to expand...

Well, they did not index all of the fields. Another *Very* important point
here is that those fields are part of a relationship.

If you build a form based on Suppliers, and then have a sub-form to
display/show all of the products, how will access retrieve ONLY just the
records that belong to a given SupplerID? We load one suppler...but have a
sub-form of products. In this case, with a with NO index on the
[Products].[SupplerID] field, then table scans will occur. So, a index is
very impoarant here.

In addition to sub-forms, the issue of building a relational sql join from
Suppliers to products also *really* needs a index to perform a join with
any amount of performance. We might restive two Suppliers...and want to
see all of their products. JET (and sql server) can thus restive the two
records..and then based on the index field...grab the child records here
(say, the 12, or 15 products). Without a index on the products.SupplerID
field....jet (or sql server) can't just retrieve the few needed records.

In fact, one the BEST performance gains you can get in ms-access is to
ensure that ALL Forign key fields are indexed for relations. I index ALL
of these in my applications.

I'll be sure none of my numeric columns ends in ID. That's for sure. I
like to control the design of my tables, not have it taken from me.

Click to expand...

As mentioned, this is only a default presented *during* table design mode.
You can see/choose the index setting. So, it is more a default during
design mode, and NOT a hidden default that you don't know about...(

Try creating a test mdb file...and start a new table in design mode. .I
believe access also scans for "code" and a few other types of keywords
that will "default" the index setting to change from none to
"indexed:....but, note it is ALWAYS in plain view to see, and this ONLY
occurs when you are adding a new field.........it is never done under the
covers hidden...

--
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
(e-mail address removed)
http://www.members.shaw.ca/AlbertKallal

John Vinson · May 12, 2006

Dear John:

An autonumber used for a relationship between two tables is certainly what
I'd call a surrogate key. If you have another definition, I'd be glad to
here it. But that's not the point here.

Well, yes - it's a surrogate key in the "one" side table. Sorry!

The question is, why would there be an index on the CategoryID column in the
Products table, that is, in the table on the MANY side of the relationshkp.
I know there must be a unique index on the key column on the ONE side of the
relationship. I have created many one-to-many relationships, but having an
index on the key column on the MANY side has never been needed. Don't you
concur?

Which brings me back to the question. Why index the CategoryID and the
SupplierID in the table on the MANY side of the relationshkp, the Products
table?

I'm not certain just HOW the JET database engine uses these indexes in
the process of enforcing relationships, but that's my understanding.
If you use the Relationships Window (or even the Lookup Wizard) to
relate two tables, you must have the unique index already set in the
"one" table; the program will create a nonunique index in the "many"
table as part of the process.

Again, I'm not privy to the details of the code so I can't say why;
but I can speculate that it's more efficient to determine whether an
orphan record would be created by joining two indexes.

John W. Vinson[MVP]

Tom Ellison · May 12, 2006

Dear John:

You can drop these indexes and the relationships survive just fine. They
are not needed for a relationship. A UNIQUE index on the One side is
required, but, the best I understand, no index is required on the MANY side
table. That makes sense to me.

I believe Albert has exposed this accurately. Access can be creating these
automatically. Now that he mentions it, I'm just sure I've seen it. Never
knew why, though.

Tom Ellison

Wolfgang Kais · May 12, 2006

Hello Tom.

Tom said:
In the Northwind database, there are several indexes for which I do
not understand the purpose.

In the Products table, there are indexes on CategoryID and SupplierID.
For what purpose are these created? I haven't seen anywhere they are
used. If there is a reason for them, I'd like to know.

These are indexes of surrogate key values that are the autonumbers in
other tables.

Indeed, these two indexes you see in the indexes window from the table
design window are unnecessary, they even are redundant.
Unlike SQL Server, Jet always creates "foreign" indexes when you create
foreign key contraints (relationships with referential integrity enabled).
These indexes aren't listed in the indexes window, but you'll find them
using vba (dao) or using the database documentor (Tools, Analyze).
They should be named CategoriesProducts and SuppliersProducts.

Tom Ellison · May 12, 2006

Dear Wolfgang:

Thanks very much. Although I hadn't understood such indexes were being
created automatically, as Albert also explained, I had strong doubt they are
unnecessary.

I'm working on a paper to critique Northwind. It seems that many Access
users have studied Northwind, and it's held up as an example of good
practice. I propose to provide a version of Northwind that is more worthy
of that. I appreciate your help in doing this.

Tom Ellison

Albert D.Kallal · May 12, 2006

Finally, here's the nasty part. You can, of course, have only one Primary
Key. When the database is Compact/Repaired, the rows are put in the order
of that PK.

Yes, they are. And, unfortunately, as a design criteria, in those child
tables...I *always* include a primary key autonumber field
(before even reading farther...I know where this is going!!).

When you build a filter for a form, possibly based on its controls, you
would be filtering by the natural key. When you have a form/subform
setup, and you navigate successively through the FK table, the dependent
table is accessed successively in the natural key order, that is, the
order of the foreign key table in its form.

Without question, we now have a dilemma. It would seem *better* to NOT have
a primary key (autonumber) in the child table, but ONLY have the foreign key
value..and, if we make it the index, then the child table will be
"clustered" for us (to steal a sql server term). In a sense, just like sql
server can only have one clustered index....we kind of have the same thing
in jet (however, with sql server, you can have both a primary key, and the
clustered index does NOT have to be this PK).

If instead you have an autonumber key for the FK table, and that is in the
dependent table instead of the natural key from the FK table, then there
is no opportunity to put the rows of the dependent table in natural key
order, and the disk access will be random instead of sequential.
Considering that the segments of the database are 4K, this means that you
will be reading all over the database frequently, instead of processing 20
to 50 rows or so sequentially before another HD access. Remember, each
access takes 10 mS or so, and they really add up when compared to
processing what is in memory.

*Excellent* observation on your part!!! (just great!!).

Ok, so, now I going to give my snottily answer to the fact that I *usually*
include a PK field (autonumber) in that child table!!!
I have a PK in those child tables for two reasons

1) - well, obviously, if this table needs a child table...then I
will need a PK
(so, for flexibility, and future design consideration...I have a
pk)
2) - in invoice type stuff, I often want to retrieve the *last*
invoice date...and that also includes the case where a person might have
more then ONE invoice in the same day...so, I want the last one...the only
way to get this is grab the top 1 date....but throw in a "order by ID desc".
With the addition of the autonumber field..then each row is unique.
3) not in love with CODD..but I do believe that each table should
have a unique key to identify it.

However, as you point out, the instant I throw in that autonumber PK in this
child table, then I mess up the clustering I would normally get if I just
had a index on the FK field..and NO autonumber....

However, it turns out that *much* of the time, you do actually get the child
table in the correct order. For example, take a order taking application,
and I start to type in order details (obviously in a sub-form that is a
child table). It turns out that in *MOST* cases, I will be adding those
child tables one after another. So, while I don't get clustering by FK
order, the PK order is *useally* the same!!! In other words, if I add 5 new
child records, they all have the same FK, but the PK is actually also
auto-numbered one after another. Thus even after compacting...my frame/data
order is in fact clustered by PK value order, but they WERE entered in that
order!! Thus OFTEN the PK order will group values together by FK values.(at
least the times when I entered data in a sub-form in one shot...and that
occurs a LOT of the time!!).

Of course, this luck of the same order is not the same all the time. For
example, adding a new customer invoice would happen over time. I do loose
out on clustering in that case. However, for each invoice I create, and
other types of data entry, surprisingly often, the PK order follows the FK
order very close.

So, I can try and make a case to point out that having a PK is not ideal in
a child table...but, it is also not as bad as one might think!!

Guest · May 12, 2006

Tom,

I'm working on a paper to critique Northwind. It seems that many Access
users have studied Northwind, and it's held up as an example of good
practice.

While many Access users, including myself, have studied Northwind, I've
never heard of anyone holding it up as a example of good practices. It's
simply convenient to use in examples, since this database is so widely
available. The fact is that there simply are not m(any) high quality Access
samples put out by Microsoft. A lot of them have Option Explicit missing in
one or more code modules, some use reserved words, naming conventions are
usually not used, duplicate indexes, etc. etc.

I propose to provide a version of Northwind that is more worthy
of that.

That shouldn't be too hard. However, you might want to provide two versions
that are more worthy. One that has only the natural keys that you have such a
professed love of using, and another copy for those of us who choose to use
pseudo keys (autonumber). I'm not going to get in that religous argument with
you, other than to say I am on the pseudo key side of the fence. From your
past posts, I already know that we are on opposite sides of the fence on this
issue.

Wolfgang is 100% correct in his statements. Here are some statements made by
author and MVP John Viescas on this subject that you might want to check out:

http://groups.google.com/group/micr..._frm/thread/a61c485120605fb9/cfb61e61c63c7b35

http://groups.google.com/group/micr..._frm/thread/ad42632ff4ab30bb/08fdb992fef47676

and

http://groups.google.com/group/micr..._frm/thread/8a2e6a43362d23cd/c11330eaafa3bf45

In answer to your initial question about why the duplicate indexes are
present, here is an short tip that I wrote a few years ago that explains the
situation:

http://www.access.qbuilt.com/html/gem_tips.html#AutoIndex

Tom Wickerath, Microsoft Access MVP

http://www.access.qbuilt.com/html/expert_contributors.html
http://www.access.qbuilt.com/html/search.html
__________________________________________

John Vinson · May 12, 2006

You can drop these indexes and the relationships survive just fine. They
are not needed for a relationship. A UNIQUE index on the One side is
required, but, the best I understand, no index is required on the MANY side
table. That makes sense to me.

Interesting! No, I was not aware of that. Thanks!

John W. Vinson[MVP]

Tom Ellison · May 12, 2006

Dear Albert:

Your analysis is following my concepts very, very well.

Now, not every application we write is order entry. Consider what happens
in an inventory application. The foreign key table is the list of all
inventroy items, the dependent table is the "activity" on that item,
purchases, sales, shrinkage, etc.

In this case, the rows in the dependent table are not grouped together in
time, and therefore not in their numbering. Doesn't this blow away the
organization you would have if the autonumbers were assigned "close together
in time"? And isn't this an example of what happens in most table
relationships in most applications written?

As I understand the analysis you've undertaken (and you seem to have lept
gracefully to the conclusion before I every tried to make it) then having
the natural key be the PK would be the solution. This does not preclude
having a surrogate autonumber key, and making it a unique key of the table,
which also allows it to be used in relationships. At this point, the
discussion is not whether surrogate key relationships are superior or not.
It is just about which key should be the PK.

This presumes you would typically have BOTH the identity key and a unique
natural key. That's OK, isn't it? So, which should be the PK? I do not
believe you lose anything making the natural key the PK, and you would often
gain significantly. Is that what your analysis tell you, too?

Thanks very much for the discussion. I'll heat up some more vegetable soup
for you, and pour a glass of wine. (What a fine, stimulating evening that
was!)

See you again at a Summit I hope!

Tom

Tom Ellison · May 12, 2006

Dear Tom:

I propose to demonstrate the results of the two different database design
methods, with and without autonumber surrogate keys, testing performance
under conditions of inserts, updates, deletes, and straight SELECT queries.
I chose Northwind as the basis for this. I will soon have two copies the
table, indexes, and relationships, each built on the different paradigms.
I'll create a large amount of randomly generated data and insert it in each
database, then run timings of various operations.

Tom Ellison

Too Many Indexes Bug	5	Apr 16, 2010
Northwind as an example	8	May 12, 2006
Sales & Purchase	1	May 24, 2010
Simple questions - compiling etc	1	Oct 4, 2005
can't alter filed values in select query	3	Dec 16, 2006
File Size Spiralling Out of Control (Continued)	1	Sep 23, 2008
Convert a Number field to Autonumber?	15	Jan 17, 2009
Coding to create linked tables	5	May 15, 2006

Northwind Mystery

Tom Ellison

Norman Yuan

Tom Ellison

John Vinson

Tom Ellison

Albert D.Kallal

Tom Ellison

Arvin Meyer [MVP]

Jeff Conrad

Albert D.Kallal

Tom Ellison

John Vinson

Tom Ellison

Wolfgang Kais

Tom Ellison

Albert D.Kallal

Guest

John Vinson

Tom Ellison

Tom Ellison

Ask a Question

Similar Threads