How do you mulitply in a field?

B

BruceM

Jamie Collins said:
We have a minor language issue. You seem to use "relationship" to mean
"Access Relationship with RI enforced", whereas I would use "FOREIGN
KEY" or perhaps "referenced table" and "referencing tables" in
context. FWIW I model a relationship using a table which I call a
"relationship table" which IIRC you call a "junction table".

I don't necessarily mean with RI enforced, but in most table relationships
that is the case. A join in a query may not have RI enforced. When I refer
to a table relationship I am using Access terminology to mean just what you
surmise.

Let's say tblStudent contains student information, and tblCourse contains
Course information. Each Student can be in many Courses, and each Course
contains many Students, so tblStudentCourse resolves the many-to-many
relationship. I call tblStudentCourse a junction table. A junction table
in my lexicon is this specific type of table.
So, I'm assuming you are referring to a FOREIGN KEY (my term). You are
correct that the NULL value here is a complication and one worth
avoiding by design. In Access/Jet, NULL and non-NULL values in the
referencing table must match exactly in the referencing table by
treating the NULL value as if it were an actual value; while this is
not strictly the correct way to handle the NULL value, it has utility
because the non-matching non-NULL values will cause the FK to bite,
also altering the NULL value to a non-NULL value in the referencing
table will result in the new value being CASCADEd to the referencing
table (assuming the ON UPDATE CASCADE referential action has been
specified for the FK). Contrast this with SQL Server which will not
test any FK rows involving the NULL value (i.e. the non-NULL values
may not match but the still FK will not bite as it would in Access/
Jet) and changing from the NULL value to a non-NULL value does not
CASCADE; arguably more correct but has less utility. I'm finding it
hard to describe long hand so here's an example (sorry its ANSI-92
Query Mode and includes the dreaded DDL):

CREATE TABLE Test1
(
col1 INTEGER NOT NULL,
col2 INTEGER, UNIQUE (col1, col2)
)
;
CREATE TABLE Test2
(
col1 INTEGER NOT NULL,
col2 INTEGER,
FOREIGN KEY (col1, col2)
REFERENCES Test1 (col1, col2)
ON DELETE CASCADE
ON UPDATE CASCADE
)
;
INSERT INTO Test1 (col1, col2)
VALUES (1, NULL)
;
INSERT INTO Test2 (col1, col2)
VALUES (1, NULL)
;

Consider this query:

SELECT *
FROM Test1 AS T1
INNER JOIN Test2 AS T2
ON T1.col1 = T2.col1
AND T1.col2 = T2.col2;

returns no rows because T1.col2=T2.col2, NULL=NULL, causes the row to
be removed from the resultset; put crudely, you can't compare the NULL
value to anything, even the NULL value.

I cannot get the CREATE TABLE stuff to work, even though I am using the
correct query mode. It expects a line number or a label or a few other
things. However, I think I see what you are getting at.
Now consider this:

DELETE FROM Test1;

causes the row in Test2 to be removed; Access/Jet successfully matched
NULL=NULL. Impressive or no? In SQL Server, the row would remain (not
orphaned because it was never considered to be referencing in the
first place).

That's one of the reasons why I used a NOT NULL end_date when I posted
the example i.e. to avoid this complexity ...but you did ask :)
Regardless, I would expect any designer to use (employee_number,
start_date) as the key for referencing tables.


I've tried to show you tables with a PRIMARY KEY but no logical key,
tables where a PRIMARY KEY cannot prevent duplicate data, and tables
where PRIMARY KEY is used for purposes of physical indexing rather
than logical keys.

I am lost with the difference between PRIMARY KEY and primary key, although
I suspect you are referring to the difference between a logical key (the
combination of fields that assures the record is unique) and a key that is
used in relationships with other tables (since a dozen fields may be needed
to construct a suitable logical key). However, as I have explained several
times I ensure by one means or another that records are unique. I have
asked how one places a unique constraint on a combination of fields without
making them into an Access PK. I don't know much about using a database
engine other than Jet, so the differences with a SQL engine are lost on me.
If you have answered my question about multi-field keys, I could not
understand your response as an answer to the question.
When you choose to use PRIMARY KEY, the choice of what you use it for
it arbitrary; feel free to omit a PRIMARY KEY but be ready for someone
to challenge you on it (e.g. document your reasons) because there is
this 'touch stone' in SQL that every table should have a primary key/
PRIMARY KEY -- I really don't know what variety they mean! Personally,
I think every table should have a both a primary key and a PRIMARY
KEY, basing their choice for the latter on good (documented) reasons.
For Access/Jet, clustering is IMO the best reason for using PRIMARY
KEY but I do know that many people value that bold text in the Access
'Relationships' diagram.

Final point on this subject: if someone is unaware of Access/Jet's
clustering behaviour, can they really make an informed decision about
PRIMARY KEY designation?

No idea. I know that my databases work, and it's not just dumb luck.
Do I need to tell you that your approach doesn't actually solve the
problem at hand? The aim is to prevent this (aircode):

INSERT INTO SUVs (VIN, etc)
SELECT VIN, etc)
FROM Vehicles
WHERE vehicle_type = 'Sedan';

I prevent it by limiting the choices the user can make once SUV has been
selected.
and similar updates that are contrary to the business rules.

What's your position on PRIMARY KEY? If I showed you this table:

CREATE TABLE Employees
(
employee_number INTEGER
);

and pointed out that all nine front end programs known to use the
database have front end code to trap duplicates and NULLs, would you
think it was a good idea to omit a NOT NULL unique constraint from the
SQL table?

I would tend not to edit employee information in those databases. When I
need to edit the information I go to the database that contains that
information. That is to say, I have one way to edit the information.
Anyhow, I *never, ever* argued in favor of allowing nulls in required
fields. If I am using the Employee table to populate a combo box list from
which a name is selected, there is no way to select a Null value. I can
require that the field be filled in before the rest of the record can be
completed by means of front end code to enforce the fact that the employee
ID is required in the table. Or I can use validation at the table level,
although that is not my usual choice.
 
J

Jamie Collins

I have
asked how one places a unique constraint on a combination of fields without
making them into an Access PK. I don't know much about using a database
engine other than Jet, so the differences with a SQL engine are lost on me.
If you have answered my question about multi-field keys, I could not
understand your response as an answer to the question.

ALTER TABLE EmployeeSalariesHistory ADD
CONSTRAINT my_constraint_name
UNIQUE (employee_number, start_date);

This will work for Jet in ANSI-92 Query Mode Jet SQL. It also happens
to be standard SQL so has a very good chance of working in any given
SQL DBMS. You can do the same in the Access user interface by create
an index, choosing the fields and specifying 'allow duplicates =
no' (or similar; haven't got Access on this machine).
No idea. I know that my databases work, and it's not just dumb luck.

An mdb doesn't need indexes to work but they can improve performance.
Why miss out on the opportunity of a physically ordered index?

BTW when I say 'indexes' I mean the 'allow duplicates = yes' kind i.e.
the ones for performance and nothing else.
I prevent it by limiting the choices the user can make once SUV has been
selected.

I open up Excel, create a new MSQuery session, connect to your mdb,
and execute this from it's SQL window:

INSERT INTO SUVs (VIN, etc)
SELECT VIN, etc)
FROM Vehicles
WHERE vehicle_type = 'Sedan';

How has your application 'limited my choices'? I now have 'sedans' in
my SUVs table!
I *never, ever* argued in favor of allowing nulls in required

I think you missed my point.

You said earlier you would trust your front end controls to prevent
'sedans' from being entered into the 'SUVs' tables. Because I consider
this unacceptable myself (data integrity being a job for the data
layer IMO), I was wondering whether you think a simple unique
constraint (or PK if you like) can also be trusted to the front end
application or nine front end applications in this case. What if one
of those applications had a bug in their duplicate testing routine?
Personally, I'd want a constraint in the DBMS to trap such 'data'
bugs.

Jamie.

--
 
B

BruceM

Jamie Collins said:
ALTER TABLE EmployeeSalariesHistory ADD
CONSTRAINT my_constraint_name
UNIQUE (employee_number, start_date);

This will work for Jet in ANSI-92 Query Mode Jet SQL. It also happens
to be standard SQL so has a very good chance of working in any given
SQL DBMS. You can do the same in the Access user interface by create
an index, choosing the fields and specifying 'allow duplicates =
no' (or similar; haven't got Access on this machine).

OK, thanks. That led me on an investigation I might not have made
otherwise.
An mdb doesn't need indexes to work but they can improve performance.
Why miss out on the opportunity of a physically ordered index?

Who said I avoid indexes? I have used them on single fields only, but I can
see the value of indexing on a combination of fields (although maybe not
always a unique combination, for instance with FirstName and LastName
fields).
BTW when I say 'indexes' I mean the 'allow duplicates = yes' kind i.e.
the ones for performance and nothing else.


I open up Excel, create a new MSQuery session, connect to your mdb,
and execute this from it's SQL window:

INSERT INTO SUVs (VIN, etc)
SELECT VIN, etc)
FROM Vehicles
WHERE vehicle_type = 'Sedan';

How has your application 'limited my choices'? I now have 'sedans' in
my SUVs table!

I apply security so that the database can't be hacked.
I think you missed my point.

You said earlier you would trust your front end controls to prevent
'sedans' from being entered into the 'SUVs' tables. Because I consider
this unacceptable myself (data integrity being a job for the data
layer IMO), I was wondering whether you think a simple unique
constraint (or PK if you like) can also be trusted to the front end
application or nine front end applications in this case. What if one
of those applications had a bug in their duplicate testing routine?
Personally, I'd want a constraint in the DBMS to trap such 'data'
bugs.

When I select an Employee in one of those nine databases, the number only
gets stored. If I am selecting only one name (number), the duplicates issue
does not occur. If I am selecting subform records (attendees at a training
session or something) then I can see the point to a multi-field index in the
child table, consisting of the EmployeeID together with the table's key
field. However, I don't want it ever to get to the point of the user seeing
the built-in error message that results from accidentally selecting the same
name twice.

I'm glad finally to understand the distinction between multi-field index and
multi-field PK.
 
J

Jamie Collins

Who said I avoid indexes?

I'm trying to draw a parallel. Consider I pointed out to someone that
I thought they'd get better performance if they indexed certain
columns and they replied, "I know that my database works, and it's not
just dumb luck." Yes, 'works' trumps 'optimized' but why not have
both?
I have used [indexes for performance] on single fields only, but I can
see the value of indexing on a combination of fields (although maybe not
always a unique combination, for instance with FirstName and LastName
fields).

You put your physically ordered index -- arguably your best index ,
your uber-index -- on an autonumber column, which is unique *and*
meaningless, right? And you do this because you value the fact it
appears as bold text in the Relationship diagram to indicate that it
should be used in a foreign key and joins, right? It's your choice of
course, and it's an arbitrary one, and you can decide to ignore
clustering, as is your prerogative as designer. But another designer
who is *unaware* of clustering, can we say they've made an informed
decision?
I apply security so that the database can't be hacked.

Is this another case where I have to come up with an example that
exactly fits before you can see the point I'm trying to make? Can't
you cut me a little slack <g>? I'm not referring to malicious
attacks.

It is common for a user of an Access application to have privileges on
objects in the back end database and in my experience it is almost
inevitable that the objects will at some point be queried using
something other than your Access application; I'd suggest Excel is the
tool of choice for this (take a look in the Excel groups), followed
closely by linked tables in another mdb. You've never encountered
this? Never mind. It could be you have a way of ensuring that only
your Access application has privileges on the database; if so, please
share your techniques.

Allow me to me try another angle. Consider this VBA code in the
proverbial 'Vehicles' Access app (the one will all the exclusive
privileges, say):

' Create some dynamic SQL
strSQL = "INSERT INTO SUVs (VIN)" & _
strSQL = strSQL & " SELECT VIN) FROM Vehicles"

' Limit to SUVs only
sSQL = sSQL & " WHERE vehicle_type = 'SUV';"

' Update the table
Connection.Execute strSQL

Oops! The coder appended the search condition to the wrong variable,
one will a similar name.

If it was my database I'd want a validation rule at the data engine
level that would highlight the error to the coder pretty early on,
otherwise my database is open to 'attack' from any bug that gets
introduced into the front end(s).

In my land we call this, "Belt and braces," and I've heard Mr Celko
call it, "Mop the floor and fix the leak." But if you are only
prepared to do one thing then I suggest it should be done as close to
the data as possible i.e. engine-level constraints (FWIW a CLR rules
engine shared by SQL Server database engine and front end is
appealing).

And if you'd do entity type checking in the front end, why not do your
unique indexes and FKs in the front end too?!

Jamie.

--
 
B

BruceM

Jamie Collins said:
I'm trying to draw a parallel. Consider I pointed out to someone that
I thought they'd get better performance if they indexed certain
columns and they replied, "I know that my database works, and it's not
just dumb luck." Yes, 'works' trumps 'optimized' but why not have
both?

Did you notice I said I can understand the value of a multi-field index, now
that I understand the distinction you are making between an index and a
primary key (as the term is used in Access)? I do not know about clustering
behavior. I don't know if you are referring to PRIMARY KEY or primary key,
or what the distinction is. I can't make an informed decision about the air
pressure in my car's tires, either, because I don't understand the physics
involved. However, people who know about such things have suggested the
correct air pressure. I don't give it a lot of thought beyond that.
I have used [indexes for performance] on single fields only, but I can
see the value of indexing on a combination of fields (although maybe not
always a unique combination, for instance with FirstName and LastName
fields).

You put your physically ordered index -- arguably your best index ,
your uber-index -- on an autonumber column, which is unique *and*
meaningless, right? And you do this because you value the fact it
appears as bold text in the Relationship diagram to indicate that it
should be used in a foreign key and joins, right?

NO! You are projecting something I never so much as hinted at. I use it
because it doesn't change, and because I prefer the simplicity of a
single-field join where possible. This is not about the convenience of bold
type.
It's your choice of
course, and it's an arbitrary one, and you can decide to ignore
clustering, as is your prerogative as designer. But another designer
who is *unaware* of clustering, can we say they've made an informed
decision?


Is this another case where I have to come up with an example that
exactly fits before you can see the point I'm trying to make? Can't
you cut me a little slack <g>? I'm not referring to malicious
attacks.

Again, I apply security so that the database cannot be changed outside of
the parameters I choose as the developer. The security can be circumvented
by a determined and knowledgeable user, but for my situation it is adequate.
It never occurred to me that somebody would take the Excel or some such
similar route, which sounds to me like a skilled and determined attempt to
undermine the data integrity. If the user can get through the security, no
approach to indexing is going to help. The answer, I'm afraid, is that I
don't see your point.
It is common for a user of an Access application to have privileges on
objects in the back end database and in my experience it is almost
inevitable that the objects will at some point be queried using
something other than your Access application; I'd suggest Excel is the
tool of choice for this (take a look in the Excel groups), followed
closely by linked tables in another mdb. You've never encountered
this? Never mind. It could be you have a way of ensuring that only
your Access application has privileges on the database; if so, please
share your techniques.

Allow me to me try another angle. Consider this VBA code in the
proverbial 'Vehicles' Access app (the one will all the exclusive
privileges, say):

' Create some dynamic SQL
strSQL = "INSERT INTO SUVs (VIN)" & _
strSQL = strSQL & " SELECT VIN) FROM Vehicles"

' Limit to SUVs only
sSQL = sSQL & " WHERE vehicle_type = 'SUV';"

' Update the table
Connection.Execute strSQL

Oops! The coder appended the search condition to the wrong variable,
one will a similar name.

If it was my database I'd want a validation rule at the data engine
level that would highlight the error to the coder pretty early on,
otherwise my database is open to 'attack' from any bug that gets
introduced into the front end(s).

You know, I have actually tried to agree with you on some points. I
understand the value a multi-field index can have. I added that I want the
user to be informed by words of my choosing rather than default error
messages. I want to intercept the problem before then. The front end is
for the user's convenience, and I want it to be as convenient as is
possible.
In my land we call this, "Belt and braces," and I've heard Mr Celko
call it, "Mop the floor and fix the leak." But if you are only
prepared to do one thing then I suggest it should be done as close to
the data as possible i.e. engine-level constraints (FWIW a CLR rules
engine shared by SQL Server database engine and front end is
appealing).

See my previous comment.
And if you'd do entity type checking in the front end, why not do your
unique indexes and FKs in the front end too?!

Because there are no tables in the front end.
 
B

Brent Spaulding \(datAdrenaline\)

Agree? Not really, other than we all seemed to agree that a resultset
(my term) does not need to be normalized.

I am surprized you do not agree. A VIEW is a 'table' by SQL-92 definitions,
and a 'table' is 'a multiset of rows' by SQL-92 definition, and 'a multiset
of rows' is a 'resultset' ... admittedly the definition of a 'resultset' is
my interpretation based on context through out the thread. With that line
of thought, a VIEW is, by definition, a 'resultset', which does not need to
be normalized, as we agreed.

If you do not agree to that, then, how 'bout this ... A VIEW is, by SQL-92
definition is a 'derived table' ... and 'derived tables' should not be
forced to follow normalizton guidlines.
You seem to be saying that normalization can be applied to SQL objects
but only SQL base tables; I've agreed this can only be done
*informally*.

Yes ... I generally say statements like "My tables are normalized", (in the
context of that statement, I mean for 'table' to be analogous to 'base
table' from SQL-52
I'm saying that if you choose to apply normalization to SQL base
tables, as many people do in the groups, then you should apply it
equally to SQL viewed tables (VIEWs); as a practical approach I've
suggested that if your VIEW would not 'pass muster' as a normalized
table then alter it to become the resultset of a stored procedure and
that will prevent it being used as a table (being queried as a table,
SQL updates as a table) -- and you seem to disagree with all this!

It is my opinion that if an SQL Procedure returns a 'resultset', then its
return type is 'table', and since the 'resultset' is derived, it does not
need to be normalized.
Prehaps it's time to "agree to disagree" :)

As per norm, I am typically agreeable! ... and in our discussions, I beleive
we have both shown familiarity with the topic at hand, so ... if the words
of my posts have not changed your view on the normalization, or lack of
normalizaton, for VIEWS, then perhaps it is time to agree and say ... "good
discussion!" .. with respect of course!

See ya!
Brent Spaulding | datAdrenaline
 
J

Jamie Collins

I do not know about clustering behavior

See:

New Features in Microsoft Jet Version 3.0
http://support.microsoft.com/kb/137039

"New compacting method. Compacting the database now results in the
indices being stored in a clustered-index format. While the clustered
index isn't maintained until the next compact, performance is still
improved... The new clustered-key compact method is based on the
primary key of the table."

Defragment and Compact Database to Improve Performance
http://support.microsoft.com/kb/288631

"If a primary key exists in the table, compacting restores table
records into their primary key order. This provides the equivalent of
Non-maintained Clustered Indexes, and makes the read-ahead
capabilities of the Microsoft Jet database engine much more
efficient."
I can't make an informed decision about the air
pressure in my car's tires, either, because I don't understand the physics
involved. However, people who know about such things have suggested the
correct air pressure. I don't give it a lot of thought beyond that.

Here's the bare bones: you've have this thing (PRIMARY KEY) which
fulfils two main functions (null-free uniqueness and clustering
respectively) where one function sometimes compromises the other (a
sole unique column makes a poor cluster; multiple columns makes a
clustered compound index look like a verbose key) but you are also
given a second thing (NOT NULL UNIQUE) which performs one of the
functions (null-free uniqueness), which leaves the other function
(clustering) exclusive to the first thing (PRIMARY KEY).

I can't think of anything car tire/tyre pressure related that fits the
above so try this: Onedaywhen Cottage has central heating with zones,
each zone has a portable digita thermostat with integral clock. Mrs
Onedaywhen likes to keep the bedroom zone's thermostat on her bedside
table/nightstand because it makes a handy bedside clock. I know enough
to deduce she's chosen a lousy position for the thermostat e.g. the
heat from the bedside light sometimes causes the heating to switch off
prematurely. "Here," I say, "If you want to tell the time, use this
alarm clock. Then we can put the thermostat in a more appropriate
location." But we're at an impasse because people who know about such
things have suggested to her that the clock's timekeeping is accurate
by design because the boiler needs to know when to come on and she
does not understand the physics of printed circuit boards and oil-
fired boilers. But I keep telling her, "It's not just a clock, it's a
thermostat too. Use something else to tell the time." She replies, "I
do not know about thermostats." I've now given her some links to
articles on the manufacturer's website so the ball's in her court.
Again, I apply security so that the database cannot be changed outside of
the parameters I choose as the developer. The security can be circumvented
by a determined and knowledgeable user, but for my situation it is adequate.
It never occurred to me that somebody would take the Excel or some such
similar route, which sounds to me like a skilled and determined attempt to
undermine the data integrity. If the user can get through the security, no
approach to indexing is going to help. The answer, I'm afraid, is that I
don't see your point.

Again, I'm not referring to malicious attacks. If you give people
permissions on database object then in my experience sooner or later
someone will think it's a good idea to access those objects via Excel
e.g. to help them do their job better. This doesn't give them the
right to act irresponsibly, though. Perhaps they thought it would be
OK to correct some data directly in the table and are unaware you have
some data integrity correction code in one of your forms that has now
been inadvertently circumvented.

If I've been given a key to the stationary cupboard and I need to sign
some paper copy then I'll help myself to a pen. I still know that
having a key doesn't mean it's OK for me to go home today with ten
pens in my pocket. If you don't want me to have one of the pens, take
away my key; if you want to allocate pens yourself, keep them locked
in your desk drawer so that I *must* go through you to get a pen. It's
pretty clear that picking locks is illicit behaviour. But if I've used
my key to acquire a pen for my work, I don't see myself as having
successfully made a skilled and determined attempt to undermine your
stock control system -- I'm just trying to do my job.

Perhaps there is some middle ground e.g. you provide communal pens in
handy locations *but* keep them attached to chains, like they do in
banks, so that I don't do my usual trick of sticking it behind my ear
then forgetting and walking off with it.

Providing a SQL PROC or VIEW in the SQL DBMS is the provision of pens
(i.e. facilities me in doing my work without too much inconvenience),
and integrity constraints and data validation rules at the data engine
level are the chain (i.e. stops me messing up the system for my own
sake and others). Saying I have to always use your application's forms
is like saying I have to always use your pen, IMO too restrictive:
what if your pen has the wrong color/colour ink, doesn't work as well
as I like, you're not in the office today etc. And before you got
upset that I found a pen myself, had you mentioned that you are
considered to 'own' all the pens? If you did mention it, why do I have
a key to the pens cupboard?

Again, if you've found a way of granting privileges to the application
rather than to users (the latter being the most commonly seen
scenario) then please share your techniques, there is an Access MVP in
another thread wanting such information.
I want the
user to be informed by words of my choosing rather than default error
messages. I want to intercept the problem before then. The front end is
for the user's convenience, and I want it to be as convenient as is
possible.

Me too. Why not trap failures in the front end and in the data engine?
i.e. attempt to trap validation rules violations in the front end but
also have a validation rules in the data engine as a last line of
defence against things the front end guy missed, VBA code bugs, etc.
Errors generated by the engine can be caught so even when the front
end does encounter an 'unexpected' validation rule failure you can
trap the constraint name (or custom validation text, etc) to provide a
decent error message e.g. (pseudo code):

Private Sub btnOK_Click()
On Error Goto Err_Handle

' Do front end validation:
<<rules tests here, show message and exit on failure>>

' Attempt to write data
<<configure command here>>
On Error Resume Next
command.Execute
If Err.Number <> 0 Then
' Handle SQL errors here:
If InStr(Err.Description, "salary_periods_overlapping") > 0 Then
<<failure message for overlapping periods here>>
ElseIf InStr(Err.Description...
...
Else
<<handle unexpected SQL errors here>>
End If
End If
On Error Goto Err_Handle

<<clean up here>>
Exit Sub

Err_Handle:
Because there are no tables in the front end.

In the example, the SUVs table is not in the front end, yet you are
proposing make the test for vehicle_type='SUV' in the front end.

Some things are not so convenient to test in the front end and are a
lot easier to test with a database round trip e.g. duplicate entity
testing: easy to trap a unique key failure message but more work to
cache the entire table in memory to test for duplicates locally. Even
if testing for duplicates locally you wouldn't omit the unique
constraint from the database, so why omit the test vehicle_type='SUV'
from the SUVs table?
You know, I have actually tried to agree with you on some points.

Indeed. Did you want me to acknowledge that we're agree (or otherwise)
on certain points? I tend to concentrate efforts on the *unresolved*
points.

Jamie.

--
 
J

Jamie Collins

I am surprized you do not agree. A VIEW is a 'table' by SQL-92 definitions,
and a 'table' is 'a multiset of rows' by SQL-92 definition, and 'a multiset
of rows' is a 'resultset' ... admittedly the definition of a 'resultset' is
my interpretation based on context through out the thread. With that line
of thought, a VIEW is, by definition, a 'resultset', which does not need to
be normalized

Put another way, SQL has but one data structure, being the 'table'.
Yes ,you are correct here but I still need a term to differentiate
between the 'table' that a stored proc spits out and the 'table' that
a VIEW spits out because they *are* different even in SQL terms e.g.
you can do this:

SELECT ...
FROM MyBaseTable AS T1
INNER JOIN MyView AS T2
ON ...

and

INSERT INTO MyView (...) VALUES ...

but you can't do this:

SELECT ...
FROM MyBaseTable AS T1
INNER JOIN MyStoredProc AS T2
ON ...

nor

INSERT INTO MyStoredProc (...) VALUES ...

Can you suggest a term to differentiate a SQL table that can be used
in JOINs, can be INSERTed to etc? i.e. a collective term for base
tables and VIEWs? I tried 'virtual table' but I can see no basis for
this in the SQL-92 spec.

This difference in tables type is one of the criteria I apply (Ill
founded or no) when deciding whether a 'resultset table' is
implemented as a stored proc or a VIEW i.e. if the VIEW was a base
table for which someone could point to and say, "Your table is
denormalized" then I'd make it a stored proc.
If you do not agree to that, then, how 'bout this ... A VIEW is, by SQL-92
definition is a 'derived table' ... and 'derived tables' should not be
forced to follow normalizton guidlines.

I admit this sounds an interesting line of argument but on reflection
I cannot think of an example where a 'denormalized' structure would be
necessary in a derived table. Can you post a relevant example for
consideration? TIA.
"good
discussion!" .. with respect of course!

Agreed :)

Jamie.

--
 
B

BruceM

Jamie Collins said:
See:

New Features in Microsoft Jet Version 3.0
http://support.microsoft.com/kb/137039

"New compacting method. Compacting the database now results in the
indices being stored in a clustered-index format. While the clustered
index isn't maintained until the next compact, performance is still
improved... The new clustered-key compact method is based on the
primary key of the table."

Defragment and Compact Database to Improve Performance
http://support.microsoft.com/kb/288631

"If a primary key exists in the table, compacting restores table
records into their primary key order. This provides the equivalent of
Non-maintained Clustered Indexes, and makes the read-ahead
capabilities of the Microsoft Jet database engine much more
efficient."

I know that clustering exists. I just don't know what it means.
Here's the bare bones: you've have this thing (PRIMARY KEY) which
fulfils two main functions (null-free uniqueness and clustering
respectively) where one function sometimes compromises the other (a
sole unique column makes a poor cluster; multiple columns makes a
clustered compound index look like a verbose key) but you are also
given a second thing (NOT NULL UNIQUE) which performs one of the
functions (null-free uniqueness), which leaves the other function
(clustering) exclusive to the first thing (PRIMARY KEY).

I can't think of anything car tire/tyre pressure related that fits the
above so try this: Onedaywhen Cottage has central heating with zones,
each zone has a portable digita thermostat with integral clock. Mrs
Onedaywhen likes to keep the bedroom zone's thermostat on her bedside
table/nightstand because it makes a handy bedside clock. I know enough
to deduce she's chosen a lousy position for the thermostat e.g. the
heat from the bedside light sometimes causes the heating to switch off
prematurely. "Here," I say, "If you want to tell the time, use this
alarm clock. Then we can put the thermostat in a more appropriate
location." But we're at an impasse because people who know about such
things have suggested to her that the clock's timekeeping is accurate
by design because the boiler needs to know when to come on and she
does not understand the physics of printed circuit boards and oil-
fired boilers. But I keep telling her, "It's not just a clock, it's a
thermostat too. Use something else to tell the time." She replies, "I
do not know about thermostats." I've now given her some links to
articles on the manufacturer's website so the ball's in her court.

I can understand that heat (or lack of heat) causes a thermostat to switch
on (or off), and that something affecting the conditions in the thermostat's
immediate locale can cause the thermostat to perform other than as expected.
This is because I understand the purpose of a thermostat and the factors
that cause it to do its thermostat thing. When it comes to clustered
indexes I have no such frame of reference. You mention physical order on
the disk or something like that. I know enough about disks to know that
contiguous portions of the disk from the computer's point of view (I can't
think how else to put it) are not necessarily physically next to each other
(at least that's how it was a few years ago). This has something to do with
the disk's rotation speed and the fact that the read/write head can read
something on one side of the disk, and then a third of a rotation later read
the next thing, and so forth. However, if the next bit of information is
exactly next to the first one it's too soon for the read (the hardware can't
keep up), so the disk needs to rotate a full revolution and a bit more.
Knowledgeable people have pointed out that there are many performance
factors to be considered before physical ordering on the disk, and have
questioned whether such physical ordering is really relevant, at least in an
office-sized database. While I accept the validity of a multi-field index
(or several single-field indices), I don't accept the physical ordering
argument.
Again, I'm not referring to malicious attacks. If you give people
permissions on database object then in my experience sooner or later
someone will think it's a good idea to access those objects via Excel
e.g. to help them do their job better. This doesn't give them the
right to act irresponsibly, though. Perhaps they thought it would be
OK to correct some data directly in the table and are unaware you have
some data integrity correction code in one of your forms that has now
been inadvertently circumvented.

If they have the permissions of which you speak they are able to add
incorrect data that fulfill the table-level constraints. If anybody with a
little bit of skill can get into a secured database by way of Excel than I
am a poor developer in that I am utterly incapable of safeguarding the data.
Even if their intentions are the best, they should not be able to get into
the tables directly.
If I've been given a key to the stationary cupboard and I need to sign
some paper copy then I'll help myself to a pen. I still know that
having a key doesn't mean it's OK for me to go home today with ten
pens in my pocket. If you don't want me to have one of the pens, take
away my key; if you want to allocate pens yourself, keep them locked
in your desk drawer so that I *must* go through you to get a pen. It's
pretty clear that picking locks is illicit behaviour. But if I've used
my key to acquire a pen for my work, I don't see myself as having
successfully made a skilled and determined attempt to undermine your
stock control system -- I'm just trying to do my job.

Perhaps there is some middle ground e.g. you provide communal pens in
handy locations *but* keep them attached to chains, like they do in
banks, so that I don't do my usual trick of sticking it behind my ear
then forgetting and walking off with it.

Providing a SQL PROC or VIEW in the SQL DBMS is the provision of pens
(i.e. facilities me in doing my work without too much inconvenience),
and integrity constraints and data validation rules at the data engine
level are the chain (i.e. stops me messing up the system for my own
sake and others). Saying I have to always use your application's forms
is like saying I have to always use your pen, IMO too restrictive:
what if your pen has the wrong color/colour ink, doesn't work as well
as I like, you're not in the office today etc. And before you got
upset that I found a pen myself, had you mentioned that you are
considered to 'own' all the pens? If you did mention it, why do I have
a key to the pens cupboard?

Integrity constraints are one thing, but I don't know that my time would be
well spent devising what amount to backup validation rules at the table
level. If my front end does not stop data that violate data integrity rules
that are enforced at the table level I can accept the default error message.
The users will let me know.
Again, if you've found a way of granting privileges to the application
rather than to users (the latter being the most commonly seen
scenario) then please share your techniques, there is an Access MVP in
another thread wanting such information.

Actually, I grant permissions to groups, which is the most common scenario,
but is in the same category as granting permissions to individual users. I
don't know what you mean by "granting privileges to the application", but I
don't think I ever claimed I could or argued in favor of finding such an
approach.
Me too. Why not trap failures in the front end and in the data engine?
i.e. attempt to trap validation rules violations in the front end but
also have a validation rules in the data engine as a last line of
defence against things the front end guy missed, VBA code bugs, etc.
Errors generated by the engine can be caught so even when the front
end does encounter an 'unexpected' validation rule failure you can
trap the constraint name (or custom validation text, etc) to provide a
decent error message e.g. (pseudo code):

Private Sub btnOK_Click()
On Error Goto Err_Handle

' Do front end validation:
<<rules tests here, show message and exit on failure>>

' Attempt to write data
<<configure command here>>
On Error Resume Next
command.Execute
If Err.Number <> 0 Then
' Handle SQL errors here:
If InStr(Err.Description, "salary_periods_overlapping") > 0 Then
<<failure message for overlapping periods here>>
ElseIf InStr(Err.Description...
...
Else
<<handle unexpected SQL errors here>>
End If
End If
On Error Goto Err_Handle

<<clean up here>>
Exit Sub

Err_Handle:


In the example, the SUVs table is not in the front end, yet you are
proposing make the test for vehicle_type='SUV' in the front end.

NO!! When they select SUV the combo box for models, which gets its row
source from SQL, will be redefined or requeried as needed. If this fails
then the user is presented with a list of all models or the wrong models or
something. Engine-level constraints will not help the user, who may have no
relevant choices. I get what you are saying about how if I fail as a
developer and present the user with invalid choices, but the user makes one
of those choices anyhow, then there is nothing at the table level to stop
them. If the user selects Corolla as a type of SUV (because I have
presented the user with an invalid list of choices), you are correct that my
database design does not guard against that happening at the table level.
It is something to think about, now that I finally understand that you are
making a distinction between an index and a PK.
Some things are not so convenient to test in the front end and are a
lot easier to test with a database round trip e.g. duplicate entity
testing: easy to trap a unique key failure message but more work to
cache the entire table in memory to test for duplicates locally. Even
if testing for duplicates locally you wouldn't omit the unique
constraint from the database, so why omit the test vehicle_type='SUV'
from the SUVs table?


Indeed. Did you want me to acknowledge that we're agree (or otherwise)
on certain points? I tend to concentrate efforts on the *unresolved*
points.

You have got me to thinking about some things that I really hadn't
considered before. I have attempted to agree with you, only to have you
fire back as if I had never said anything. That gets old in a hurry. As I
have said, I use indexes on fields that I expect to be used for sorting,
filtering, and so forth. I have not given much thought to multi-field
indexes. For instance, if five fields are required I suppose I could
combine them into a multi-field non-null index, but what if 20 fields are
required? There are limitations to enforcing data integrity in this way.
 
J

Jamie Collins

Knowledgeable people have pointed out that there are many performance
factors to be considered before physical ordering on the disk, and have
questioned whether such physical ordering is really relevant, at least in an
office-sized database. While I accept the validity of a multi-field index
(or several single-field indices), I don't accept the physical ordering
argument.

Yes, there are many factors to consider as regards performance. Yes,
different people will rate these factors differently. But here's a
link to a MSDN article about query performance in an Access database
which places physical ordering at the top of its list of "Tips to
improve query performance":

Information about query performance in an Access database
http://support.microsoft.com/kb/209126

"To improve query performance, try these tips: Compact your database.
When you compact your database you can speed up queries. When you
compact your database, the records of the table are reorganized so
that the records reside in adjacent database pages that are ordered by
the primary key of the table. This improves the performance of the
sequential scans of records in the table because only the minimum
number of database pages now have to be read to retrieve the records
that you want."

Here is another a MSDN article which has a list of "Optimization Tips
and Techniques" and physical ordering is the second one listed:

Microsoft Jet 3.5 Performance Overview and Optimization Techniques
http://msdn.microsoft.com/archive/d...MicrosoftAccessMicrosoftJetDatabaseEngine.asp

"From a performance perspective, there are many reasons to frequently
compact a database. One reason is that compacting will create a new
database that stores all table rows in a contiguous order. If a
primary key or unique index is defined, the rows will be sorted in
order of the primary key or unique index. This allows Microsoft Jet to
take full advantage of its read-ahead cache and also reduces disk I/O
when doing sequential scans of a table."

Out of interest, the first one listed is "Prevent Unnecessary Query
Recompilation" and seems to relate only to stored queries and then
only in conjunction with DAO's infamous memory leak bug (which I
understand has since been fixed).

In case it's not clear what these articles are getting at, here are a
few aircode examples of 'sequential scans' that will favour physical
ordering:

SELECT *
FROM TelephoneDirectory
WHERE last_name = 'Collins';

SELECT *
FROM SalaryHistory
WHERE employee_number = '454818181'
AND start_date
BETWEEN #2006-01-01 00:00:00# AND
AND #2006-12-31 23:59:59#;

SELECT employee_number,
COUNT(*) AS salary_review_tally
FROM SalaryHistory
GROUP BY employee_number;

When was the last time you used BETWEEN or GROUP BY on an autonumber
column?

Now, be fair, where are these knowledgeable people's lists and where
exactly does physical ordering appear in their lists? If they've
omitted physical ordering from their lists, how do we know they've
even considered it? TIA.
I grant permissions to groups, which is the most common scenario,
but is in the same category as granting permissions to individual users.

OK so you give people the permissions...
Even if their intentions are the best, they should not be able to get into
the tables directly.

Then why give you give them (via their groups) the permissions to do
so?
If they have the permissions of which you speak ...

... you just confirmed that they do...
...they are able to add
incorrect data that fulfill the table-level constraints.

So why not conclude, "I'd better put in some table-level constraints
to prevent incorrect data"?
Integrity constraints are one thing, but I don't know that my time would be
well spent devising what amount to backup validation rules at the table
level.

Funny, I start with the table-level constraints first (i.e. they are
not back ups) then sigh when I think about having to duplicate the
effort in the front end <g>.

You design your tables *before* you design your forms, right? Why not
the same for constraints?
I have attempted to agree with you, only to have you
fire back as if I had never said anything. That gets old in a hurry.

Oops, sincere apologies. Please believe I don't do that on purpose. I
think it must happen only when I've missed your point (or I think
you've missed my point when you haven't etc).

Jamie.

--
 
B

BruceM

Jamie Collins said:
Yes, there are many factors to consider as regards performance. Yes,
different people will rate these factors differently. But here's a
link to a MSDN article about query performance in an Access database
which places physical ordering at the top of its list of "Tips to
improve query performance":

Information about query performance in an Access database
http://support.microsoft.com/kb/209126
"To improve query performance, try these tips: Compact your database.
When you compact your database you can speed up queries. When you
compact your database, the records of the table are reorganized so
that the records reside in adjacent database pages that are ordered by
the primary key of the table. This improves the performance of the
sequential scans of records in the table because only the minimum
number of database pages now have to be read to retrieve the records
that you want."

I never argued against compacting.
Here is another a MSDN article which has a list of "Optimization Tips
and Techniques" and physical ordering is the second one listed:

Microsoft Jet 3.5 Performance Overview and Optimization Techniques
http://msdn.microsoft.com/archive/d...MicrosoftAccessMicrosoftJetDatabaseEngine.asp

"From a performance perspective, there are many reasons to frequently
compact a database. One reason is that compacting will create a new
database that stores all table rows in a contiguous order. If a
primary key or unique index is defined, the rows will be sorted in
order of the primary key or unique index. This allows Microsoft Jet to
take full advantage of its read-ahead cache and also reduces disk I/O
when doing sequential scans of a table."

Out of interest, the first one listed is "Prevent Unnecessary Query
Recompilation" and seems to relate only to stored queries and then
only in conjunction with DAO's infamous memory leak bug (which I
understand has since been fixed).

There's also something about judicious use of indexes, and something about
the use of expressions in subqueries. The latter includes the
recommendation that expressions be used in the front end (i.e. the form) to
produce the necessary information in subforms. Anyhow, it looks like a good
article. I have bookmarked it.
In case it's not clear what these articles are getting at, here are a
few aircode examples of 'sequential scans' that will favour physical
ordering:

SELECT *
FROM TelephoneDirectory
WHERE last_name = 'Collins';

SELECT *
FROM SalaryHistory
WHERE employee_number = '454818181'
AND start_date
BETWEEN #2006-01-01 00:00:00# AND
AND #2006-12-31 23:59:59#;

SELECT employee_number,
COUNT(*) AS salary_review_tally
FROM SalaryHistory
GROUP BY employee_number;

When was the last time you used BETWEEN or GROUP BY on an autonumber
column?

Gosh, I can't remember when I last did that. But what a probing and
thought-provoking question!! <sarcasm>
I have been saying the same thing over and over, and will stop doing so
after this message. I do not rely on autonumber to make a record unique. I
use autonumber when no "natural" field meets my needs for an Access PK.
You have not answered my objection to linking tables through fields that are
subject to change.
Now, be fair, where are these knowledgeable people's lists and where
exactly does physical ordering appear in their lists? If they've
omitted physical ordering from their lists, how do we know they've
even considered it? TIA.

I have seen the stuff about compacting. No argument. Once again, I am not
arguing against compacting. That is to say, I think compacting is a good
thing, and I have observed performance differences after compacting.
However, I have seen information about how indexing choices can degrade
performance (for instance, when using indexes on City and PostalCode
fields). As the article points out, there is no single correct answer about
indexing.
OK so you give people the permissions...


Then why give you give them (via their groups) the permissions to do
so?

I don't. I never said I did. Unless you know some sort of back door (or
are a member of the appropriate group), you would not be able to open the
back end directly.
.. you just confirmed that they do...

Not to access the tables directly.
So why not conclude, "I'd better put in some table-level constraints
to prevent incorrect data"?

I wouldn't know how to limit at the table level the list of models if "SUV"
is chosen as the class of vehicle. So far I see no reason to learn.
Funny, I start with the table-level constraints first (i.e. they are
not back ups) then sigh when I think about having to duplicate the
effort in the front end <g>.

OK, fine.
You design your tables *before* you design your forms, right? Why not
the same for constraints?

Of course I design the tables (and relationships) first. However, I have
changed elements of the table design after development of forms, etc. is
well under way. I have also gone back and re-thought the design. You?
 
J

Jamie Collins

I have been saying the same thing over and over, and will stop doing so
after this message. I do not rely on autonumber to make a record unique.

I've heard you say this over and over and I believe you. Good for you.
Neither do I. I'm glad we agree. Actually, I think we agree on most
things. BTW no sarcasm intended.
You have not answered my objection to linking tables through fields that are
subject to change.

I've not tried to *refute* your objection because I think you have
every right to hold that view. I cannot tell you whether joining your
tables on fields that are subject to change is good or bad or
indifferent because I don't know anything about their structure or
content (but, sight unseen, my money would be on 'indifferent').

However, I feel I have *answered* your objection e.g. on 16 August in
this thread when I said, "I think the SQL DBMS should store the key
value only once and use an internal surrogate." We both agree that
physically storing the key value in all the referencing tables is a
Bad Thing but we have different responses to it. My response is to
shrug: it's a 'feature' of the product and if I thought the product's
performance, CASCADE referential actions, etc were not up to the job
then the product is not fit for purpose and it's a free market. Your
response is to roll your own surrogate using autonumber. I don't feel
I've tried to push my approach down your throat though I have
suggested you to consider the effects of placing a *physical* index on
an autonumber column and given some examples of how a sole autonumber
PK for referencing in FKs can work against some design patterns such
as subclassing. I've acknowledged that your way valid but is not for
me, primarily because it makes referencing tables' data harder to
read, but have wished you luck with yours.

In the same spirit, may I press you further for detail of,
"Knowledgeable people [who] have pointed out that there are many
performance factors to be considered before physical ordering on the
disk," please? In my experience, most supposedly knowledgeable people
give it no consideration; others don't seem to believe that clustering
actually happens in Jet at all!
I think compacting is a good
thing, and I have observed performance differences after compacting.
However, I have seen information about how indexing choices can degrade
performance (for instance, when using indexes on City and PostalCode
fields). As the article points out, there is no single correct answer about
indexing.

Again, we agree there are many factors to performance. You said you'd
heard physical ordering was not at the top of some lists. I've shown
two from Microsoft where it is.

One thing I'm still not clear on. You said earlier, "I don't accept
the physical ordering argument." I thought you meant you did not see
how it can be relevant to either choice of PK or performance
implications. Did I misunderstand what you meant? Has anything I've
presented caused you to revise this earlier held view? I get how you
choose your PKs (unchanging field values, single column FKs) but do
you get how physical ordering might affect PK choice for someone else?
I don't. I never said I did.

It seems I'm not understanding your approach. I see two basic
approaches:

1) Users can perform updates to your tables' data via your application
because you have granted the privileges on the tables to the users
(but does not preclude the same users from performing updates to your
tables' data via another route).

2) You have not granted the privileges on the tables to the users but
they can still perform updates to your tables' data via your
application because you have granted privileges on the application to
the users and granted privileges on the tables to your application
(the advantage here is that they can *only* perform updates via your
application).

Is your approach one of these, a variation or something completely
different? Note a common third approach is that everyone runs as an
administrator but that would not seem to apply to you.

Jamie.

--
 
B

BruceM

Jamie Collins said:
I've heard you say this over and over and I believe you. Good for you.
Neither do I. I'm glad we agree. Actually, I think we agree on most
things. BTW no sarcasm intended.


I've not tried to *refute* your objection because I think you have
every right to hold that view. I cannot tell you whether joining your
tables on fields that are subject to change is good or bad or
indifferent because I don't know anything about their structure or
content (but, sight unseen, my money would be on 'indifferent').

However, I feel I have *answered* your objection e.g. on 16 August in
this thread when I said, "I think the SQL DBMS should store the key
value only once and use an internal surrogate." We both agree that
physically storing the key value in all the referencing tables is a
Bad Thing but we have different responses to it. My response is to
shrug: it's a 'feature' of the product and if I thought the product's
performance, CASCADE referential actions, etc were not up to the job
then the product is not fit for purpose and it's a free market. Your
response is to roll your own surrogate using autonumber. I don't feel
I've tried to push my approach down your throat though I have
suggested you to consider the effects of placing a *physical* index on
an autonumber column and given some examples of how a sole autonumber
PK for referencing in FKs can work against some design patterns such
as subclassing. I've acknowledged that your way valid but is not for
me, primarily because it makes referencing tables' data harder to
read, but have wished you luck with yours.

I thought a PK field is indexed, no duplicates by default (and without
options to change those aspects).
In the same spirit, may I press you further for detail of,
"Knowledgeable people [who] have pointed out that there are many
performance factors to be considered before physical ordering on the
disk," please? In my experience, most supposedly knowledgeable people
give it no consideration; others don't seem to believe that clustering
actually happens in Jet at all!

In this listing physical ordering is not listed directly, but at least one
of the referenced articles discusses indexing. I don't know how indexing
and physical ordering on the disk are related.


In previous discussions several people who have demonstrated a good command
of Access and of databases in general questioned the importance of clustered
indexes for physical ordering on the disk.
http://groups.google.com/group/micr...19f15?lnk=st&q=&rnum=1&hl=en#a592cf0d56a19f15

http://groups.google.com/group/micr...access"+collins&rnum=3&hl=en#3427209cee5d14d7

There is a variety of views within those threads.

By the way, if there is a better way to reference newsgroup discussions I
would be glad to know about it.
Again, we agree there are many factors to performance. You said you'd
heard physical ordering was not at the top of some lists. I've shown
two from Microsoft where it is.

If the listings are in order of importance, then I guess you are correct.
As I have stated several times already, I finally understand (I think) a
distinction you made between PKs and indexes.

I still doubt there is a performance issue between indexing on an arbitrary
number such as EmployeeID and and arbitrary PK. Also, I may choose to list
employees by lastname, firstname, but have many occasions to order or group
by department, startdate, or what have you.
BTW, one of the articles states that the rows will be ordered according to
the primary key OR the unique index, but leaves me guessing as to what
happens if both exist.
One thing I'm still not clear on. You said earlier, "I don't accept
the physical ordering argument." I thought you meant you did not see
how it can be relevant to either choice of PK or performance
implications. Did I misunderstand what you meant? Has anything I've
presented caused you to revise this earlier held view? I get how you
choose your PKs (unchanging field values, single column FKs) but do
you get how physical ordering might affect PK choice for someone else?

I probably did not accept the argument because I may order on one of several
fields, depending on how I wish to present the data. If one of those fields
is the unique index, but it is not the most common ordering I will use, I
wonder if there is an advantage.
It seems I'm not understanding your approach. I see two basic
approaches:

1) Users can perform updates to your tables' data via your application
because you have granted the privileges on the tables to the users
(but does not preclude the same users from performing updates to your
tables' data via another route).

2) You have not granted the privileges on the tables to the users but
they can still perform updates to your tables' data via your
application because you have granted privileges on the application to
the users and granted privileges on the tables to your application
(the advantage here is that they can *only* perform updates via your
application).

As far as I know, properly implemented user-level security prevents
back-door updates to the tables. I do know that there is no direct way of
opening the back end (and getting to the table data that way) without going
through the secure mdw file (although I am aware that user-level security is
not as robust as other methods available to SQL). What you described with
the Excel thing is in fact a back-door update, undertaken by somebody who is
deliberately and knowingly circumventing the intended route for entering
data. In my book that is hacking, not an innocent attempt to be more
efficient.
Users have full permissions to the table links in the FE, and limited
permissions (according to the group of which they are a member) to the
actual BE tables.
All updates are intended to happen by way of the user interface. As of now
users can still use the Shift key bypass, but there are ways to prevent
that. I haven't had the chance to implement that yet, as other matters have
my immediate attention, but plan to do so before long. As it is, unsecured
databases have been unmolested for quite some time, and very few people even
know about the Shift bypass option, so I'm not too concerned in the
immediate short term.
 
J

Jamie Collins

In previous discussions several people who have demonstrated a good command
of Access and of databases in general questioned the importance of clustered
indexes for physical ordering on the disk.
http://groups.google.com/group/microsoft.public.access/browse_thread/...

There's 101 posts in that thread and you linked to one which I wrote!
Who do you think is the knowledgeable person in this thread? I recall
David W Fenton being particularly obstructive telling he knew of lots
of things but wouldn't deign to reveal them and ironically did bother
[David W Fenton] Given that with Jet you can only have the one clustered index, I
really think this is a pretty irrelevant consideration. I never use
the BETWEEN operator on anything but date fields, which in the vast
majority of tables could not possibly ever be a candidate for PK
(and very seldom even a candidate for inclusion in a compound
natural key, which wouldn't give you the clustered index benefit,
anyway, unless the date was the first field of the compound key).

So, I just don't see any practical benefit in your pointing out the
performance advantage of the clustered index.

You can only have one clustered in index period -- it's not a Jet
restriction, more like the laws of physics <g>! In *this* thread I
posted the salary history example where dates are part of the
candidate keys and we see history table designs cropping up regularly
in the Access groups. And he answered his own point about including
such a date in a compound key i.e. make it the first column in the
key's definition. I generally get the feeling he was looking at things
the wrong way to be able to see my point i.e. he was considering
considers the clustering resulting from his existing choice of PK,
whereas I'm proposing one identifies the columns for clustering then
choose the PK to fit. There are some later exchanges but he seems to
dismiss the whole idea ("I think your suggestion is bloody stupid", "I
think you're mis-using your RDBMS") without giving solid reasons.

That one's you and I again! And the only other person who commented on
clustering is David W Fenton again but this time he's actually
acknowledged the point:
[David W Fenton] I can't imagine a telephone number table that would *need* a PK that
didn't involve the parent foreign key.

In any event, you'd have to have hundreds of thousands of records in
the telephone table for it to make a noticeable difference.

Again, he seemingly hasn't made the shift to, "What if you chose the
PK to purposely change the clustering...?" But he does seem to see the
light. While I'm not sure he's correct when he says "hundreds of
thousands of records" (the MSFT articles don't hint at this), tables
of such proportions are hardly a rare occurrence in the Access
groups.

One thing that bothers me is there is no frame of reference in this
thread e.g. David W Fenton does not give any comparison to other
optimization techniques. The Microsoft articles rated it highly, after
all.
There is a variety of views within those threads.

Indeed. Notice how a few people needed convincing that clustering
existed in Jet at all, after which they dropped out of the discussion.
I still doubt there is a performance issue between indexing on an arbitrary
number such as EmployeeID and and arbitrary PK.

Agreed. I can't see either would offer any advantage. Clustering is
better suited to meaningful data.
Also, I may choose to list
employees by lastname, firstname, but have many occasions to order or group
by department, startdate, or what have you.

You then have to look at the relative usage frequency, importance,
etc. I didn't say optimization was easy said:
BTW, one of the articles states that the rows will be ordered according to
the primary key OR the unique index, but leaves me guessing as to what
happens if both exist.

PK takes precedence, as advised by Lyle Fairfield in the first thread
you linked to.
I do know that there is no direct way of
opening the back end (and getting to the table data that way) without going
through the secure mdw file

Users have full permissions to the table links in the FE, and limited
permissions (according to the group of which they are a member) to the
actual BE tables.

What you described with
the Excel thing is in fact a back-door update, undertaken by somebody who is
deliberately and knowingly circumventing the intended route for entering
data. In my book that is hacking, not an innocent attempt to be more
efficient.

If they've used their own credentials via your mdw (i.e. utilizing
your security mesures rather than circuventing them) to connect to the
data in the BE from within Excel then in my book that is legitimate
usage and IMO should be accommodated.

Jamie.

--
 
B

BruceM

Jamie Collins said:
There's 101 posts in that thread and you linked to one which I wrote!

I was trying to point to the thread, not to a specific message. Albert
argued that there is no reason not to use PK. Presumably if a
non-maintained clustered index was important that would have been worth
mentioning. He did say that there are perfectly valid reasons to use a
natural key, too.
Amy had some observations on the subject, including regarding it as silly
(or some such) that a PK (or non-null unique index) would be chosen for
clustering reasons.
Who do you think is the knowledgeable person in this thread? I recall
David W Fenton being particularly obstructive telling he knew of lots
of things but wouldn't deign to reveal them and ironically did bother
[David W Fenton] Given that with Jet you can only have the one
clustered index, I
really think this is a pretty irrelevant consideration. I never use
the BETWEEN operator on anything but date fields, which in the vast
majority of tables could not possibly ever be a candidate for PK
(and very seldom even a candidate for inclusion in a compound
natural key, which wouldn't give you the clustered index benefit,
anyway, unless the date was the first field of the compound key).

So, I just don't see any practical benefit in your pointing out the
performance advantage of the clustered index.

You can only have one clustered in index period -- it's not a Jet
restriction, more like the laws of physics <g>! In *this* thread I
posted the salary history example where dates are part of the
candidate keys and we see history table designs cropping up regularly
in the Access groups. And he answered his own point about including
such a date in a compound key i.e. make it the first column in the
key's definition. I generally get the feeling he was looking at things
the wrong way to be able to see my point i.e. he was considering
considers the clustering resulting from his existing choice of PK,
whereas I'm proposing one identifies the columns for clustering then
choose the PK to fit. There are some later exchanges but he seems to
dismiss the whole idea ("I think your suggestion is bloody stupid", "I
think you're mis-using your RDBMS") without giving solid reasons.

That one's you and I again! And the only other person who commented on
clustering is David W Fenton again but this time he's actually
acknowledged the point:
One thing that bothers me is there is no frame of reference in this
thread e.g. David W Fenton does not give any comparison to other
optimization techniques. The Microsoft articles rated it highly, after
all.


Indeed. Notice how a few people needed convincing that clustering
existed in Jet at all, after which they dropped out of the discussion.

I guess that makes you the winner. After this discussion you can add me to
the list of the vanquished, because I don't feel like extending this much
longer.
Agreed. I can't see either would offer any advantage. Clustering is
better suited to meaningful data.


You then have to look at the relative usage frequency, importance,


PK takes precedence, as advised by Lyle Fairfield in the first thread
you linked to.


If they've used their own credentials via your mdw (i.e. utilizing
your security mesures rather than circuventing them) to connect to the
data in the BE from within Excel then in my book that is legitimate
usage and IMO should be accommodated.

Is there documentation that what you describe is possible?
 
J

Jamie Collins

Is there documentation that what you describe is possible?

You serious? Try these KB articles:

ACC97: How to Use IDC Files to Query a Secure Microsoft Access
Database
http://support.microsoft.com/kb/161172

How to open a secured Access database in ADO through OLE DB
http://support.microsoft.com/kb/191754

How To Open an Access Database with User Security [using DAO]
http://support.microsoft.com/kb/163002

How to open a secured Microsoft Access database in Visual C++
http://support.microsoft.com/kb/270663

HOW TO: Connect to a Security-Enhanced Microsoft Access Database by
Using Microsoft Visual Studio .NET
http://support.microsoft.com/kb/823927

How To Use ADO to Refresh/Create Linked Table for Password Secured Jet
4.0 Database
http://support.microsoft.com/kb/240222

etc etc

Try these simple steps: Open Excel 2003. On the menu, choose: Data,
Import External Data, New Database Query. In the resulting 'Choose
Data Source' dialog, select <New Data Source>, and click OK. In the
resulting, 'Create New Data Source' dialog, step 1: enter a name (for
the DSN), step 2: choose Microsoft Access Driver, step 3: click the
'Connect' button. In the resulting 'ODBC Microsoft Access Setup'
dialog, choose you mdb 'Database' (I recommend you use a copy here),
choose your mdw 'System Database', click the 'Advanced' button and
enter a valid username and password combination. OK to each dialog to
get you back to the 'Choose Data Source' dialog, select the newly-
created DSN and hit OK to take you to the main MSQuery dialog (cancel
the 'Add tables' dialog if it shows on start up). Hit the SQL button,
type in a Jet syntax SQL statement (e.g. CREATE VIEW DropMe AS SELECT
1;), click the OK button, continue past the odd 'query cannot be
displayed graphically' message and the SQL will be executed.

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top