Too much integrity?

Phil · Sep 19, 2007

I'm trying to construct a DB that's essentially a list of lists, with
each sublist having an identical structure. Seems straightforward,
right? Make a single table for ALL the items, to keep the list
structure consistent; include a field to say which list each item
belongs to; and make that part of a composite PK because the item
numbers are only unique across the list they belong to. And, for
completion's sake, create a table of lists (which I might use for things
like the list titles), and make the aforementioned field FK to it.

The problem is, an item can deserve a place in more than one list, and
updates to an item should be reflected to every list it appears in
simultaneously. It's not MtM, because each place in a given list can
only be occupied by one item, but I created a separate "places" table
anyway, since it'd be wasteful to create fields in the items table that
would invalidate the rest of the record if filled.

The places table contains four fields: a list/item pair for a given
place on a list, and a list/item pair specifying its "real" location in
the items table (that is, where it was originally added; for the first
(or only) reference to an item, these two pairs of fields would be
identical).

Next, I added more relations to enforce referential integrity: A
place's list must exist in the lists table; an item must be used at
least once as a "real" location in the places table (or it has no reason
to exist); and an item must also be used EXACTLY once as an "apparent"
location in the places table. (If an item is used in one or more other
places and then removed from its original place, I have to write code to
renumber the item to match one of the remaining places anyway, because
its old place may need to be reused for some other item.)

Unfortunately, this works too well - now I can't add any items because
"a related record is required in table 'places'", but I can't add any
places because "a related record is required in table 'items'"! What
can I do to fix this, without losing the necessary constraints?

Pat Hartman \(MVP\) · Sep 19, 2007

You are trying to replicate a "paper" list. Relational databases aren't
intended to do that. If something exists in multiple places, the
relationship IS m-m regardless of how you have tried to implement it. The
bi-directional constraint implies a m-m relationship. Perhaps you wouldn't
need the "filler" rows if you used left joins when joining to this table.

I have a similar table structure that I use for maintaining lists for
populating comboboxes. Rather than creating separate tables and related
maintenance forms/reports or hard-coding embedded lists, I use a single
table with a form/subform for user interface and reports that can print any
or all of the lists. I create a separate querydef for each unique "list"
and use that query as the RowSource for a combo. I have implemented this
structure in COBOL with both IMS/DB and DB2 back ends and in Access with a
variety of RDBMS back ends. In my case, the relationship truly is 1-m. An
item appears in one and only one list.

Jamie Collins · Sep 20, 2007

Make a single table for ALL the items

an item can deserve a place in more than one list, and
updates to an item should be reflected to every list it appears in
simultaneously. It's not MtM, because each place in a given list can
only be occupied by one item

....Agreed, it sounds like multiple 1:1 relationships between the
'items' and the specialist lists' items...

include a field [in the table of for ALL the items] to say which list each item
belongs to; and make that part of a composite PK because the item
numbers are only unique across the list they belong to.

....and I think this may be where you are going wrong.

Consider the following example which uses item name as a key:

CREATE TABLE Items (
item_name VARCHAR(30) NOT NULL UNIQUE
)
;
CREATE TABLE ListGreatArtworks (
item_name VARCHAR(30) NOT NULL UNIQUE
REFERENCES Items (item_name)
ON DELETE CASCADE
ON UPDATE CASCADE
)
;
CREATE TABLE ListIconicDesign (
item_name VARCHAR(30) NOT NULL UNIQUE
REFERENCES Items (item_name)
ON DELETE CASCADE
ON UPDATE CASCADE
)
;
INSERT INTO Items (item_name)
VALUES ('Xanadu the Broadway show')
;
INSERT INTO Items (item_name)
VALUES ('Soap on a rope')
;
INSERT INTO ListGreatArtworks (item_name)
VALUES ('Xanadu the Broadway show')
;
INSERT INTO ListIconicDesign (item_name)
VALUES ('Xanadu the Broadway show')
;
INSERT INTO ListIconicDesign (item_name)
VALUES ('Soap on a rope')
;

You want IMO the items details in only one place so that changes are
to avoid 'update anomalies' e.g. to correct the above deliberate
mistake (<g>) requires just one SQL update:

UPDATE Items
SET 'Xanadu the movie'
WHERE item_name = 'Xanadu the Broadway show';

Put another way, there is no need for a field in the Items table to
indicate which lists they each appears in; that's what the list tables
are for!

It appears to me that you may have a need for 'subclassing' where the
item's *type* appears in the Items table (superclass table) in order
to control its use in list tables (subclass tables) and beyond e.g.

CREATE TABLE Items
(
item_name VARCHAR(30) NOT NULL,
item_type VARCHAR(20) NOT NULL,
CHECK (item_type IN ('Movie', 'Broadway show', 'Toiletry')),
UNIQUE (item_type, item_name)
)
;
CREATE TABLE Movies
(
item_name VARCHAR(30) NOT NULL UNIQUE,
item_type VARCHAR(20) NOT NULL,
CHECK (item_type = 'Movie'),
FOREIGN KEY (item_type, item_name)
REFERENCES Items (item_type, item_name),
aspect_ratio DECIMAL(3, 2) NOT NULL
)
;
CREATE TABLE Broadwayshows
(
item_name VARCHAR(30) NOT NULL UNIQUE,
item_type VARCHAR(20) NOT NULL,
CHECK (item_type = 'Broadway show'),
FOREIGN KEY (item_type, item_name)
REFERENCES Items (item_type, item_name),
on_tour CHAR(1) DEFAULT 'N' NOT NULL
)
;
CREATE TABLE ListGreatMovies (
item_name VARCHAR(30) NOT NULL UNIQUE
REFERENCES Movies (item_name)
)
;
CREATE TABLE ThatsEntertainment (
item_name VARCHAR(30) NOT NULL,
item_type VARCHAR(20) NOT NULL,
CHECK (item_type IN ('Movie', 'Broadway show')),
UNIQUE (item_type, item_name),
FOREIGN KEY (item_type, item_name)
REFERENCES Items (item_type, item_name)
)
;

I can't add any items because
"a related record is required in table 'places'", but I can't add any
places because "a related record is required in table 'items'"! What
can I do to fix this, without losing the necessary constraints?

Click to expand...

You may be wondering why my latter example has no integrity constraint
to ensure that every item in the Items table with an item_type
attribute of 'Movie' has a corresponding row in the Movies table.
While it is possible to *define* such a constraint, it makes it
impossible to use in practice (as you have discovered).

The underlying problem is that the SQL language has no concept of
multiple assignment and although the SQL-92 standard includes
deferrable constraints Jet does not support them. The best I can come
up with is to create a Jet VIEW (non-parameterized Access Query object
defined using a SELECT query) for each subclass table which JOINs to
the superclass table e.g.

CREATE VIEW MovieItems
AS
SELECT T1.item_name AS T1_item_name, T1.item_type AS T1_item_type,
T2.item_name AS T2_item_name, T2.item_type AS T2_item_type,
T2.aspect_ratio
FROM Items AS T1
INNER JOIN Movies AS T2
ON T1.item_name = T2.item_name
AND T1.item_type = T2.item_type;

Inserting into such a VIEW successfully creates rows in both tables
and satisfies the timing of the constraint checking (i.e. you don't
get a "a related record is required" error).

The repetitive nature of the columns in the VIEW is necessary but
doesn't make it user friendly, therefore I'd suggest simplifying using
a SQL PROCEDURE (Access Query object defined using a either a SQL
statement -- INSERT/UPDATE/DELETE etc -- or a parameterized SELECT
query) e.g.

CREATE PROCEDURE AddMovieItem
(
arg_item_name VARCHAR(30),
arg_aspect_ratio DECIMAL(3,2) = 1.85
)
AS
INSERT INTO MovieItems (T1_item_name, T1_item_type, T2_item_name,
T2_item_type, aspect_ratio)
SELECT DISTINCT arg_item_name, 'Movie', arg_item_name, 'Movie',
arg_aspect_ratio
FROM Calendar;

Usage:

EXECUTE AddMovieItem 'Xanadu';

Privileges can now be used to control adding such items via the PROC
rather than the tables. While this is not as satisfactory IMO as a
constraint, it achieves the same goal (i.e. ensures bad data cannot be
created) and is anyhow more proactive than a constraint alone (i.e.
facilitates the creation of good data).

PS Great title for a post

Jamie.

--

Phil · Sep 20, 2007

Pat said:
You are trying to replicate a "paper" list. Relational databases aren't
intended to do that.

I understand that. I'm trying to do this as a database specifically to
eliminate having to track down all the places an item might appear and
update them separately; however, even the best designed databases do
have to have their data represented on a paper report once in a while...

If something exists in multiple places, the
relationship IS m-m regardless of how you have tried to implement it. The
bi-directional constraint implies a m-m relationship.

Okay, I think I get this; it's m2m because a list can contain multiple
items and an item can appear on multiple lists. I have to stop trying
to think of the "places" table as a real world entity and start thinking
of it as merely a means of establishing that m2m relationship.

Perhaps you wouldn't
need the "filler" rows if you used left joins when joining to this table.

What filler rows are you referring to? The original reason I created
the "places" table was to make it so I wouldn't need item records that
were mostly blank - or are you talking about something else?

I also don't understand how a left (or right) join would help matters
any. In all the cases I can think of, I'm dealing with subsets of a
table - either the subset of items that belong on a given list, or the
subset of lists that a given item is on. In no case should a list have
zero items; in no case should an item appear on zero lists.

I have a similar table structure that I use for maintaining lists for
populating comboboxes.

I've worked with similar structures in the past - but I think I got you
too narrowly focused on combo boxes and the like by calling my tables
"lists" and "items". I called them that here to try to NOT lock anyone
into reading a specific purpose into them, so the people seeing this
wouldn't say, "Oh, I've never made a DB that does foo to bar, so I won't
reply because my experience probably wouldn't apply."

Still, none of this seems to address the problem I came here for help with:

To which I'll now add: Are all the constraints I created really necessary?

Pat Hartman \(MVP\) · Sep 20, 2007

You need to recreate the schema so that it supports a many-to-many
relationship. That will solve your constraint issue. The constraints can't
be bi-directional or you'll never be able to add new parents to a table
since its children don't exist. If you think of it in those terms, the
relationship being bi-directional doesn't make any sense. How can you have
children without first having parents?

The many-to-many relationship may be sparse. That means that a relation
record only exists in the relationship if there is something to store in it.
A real world example of this would be classes and students. A student has
many classes and a class has many students. But, a student has only one
final grade for a class so the relationship table contains a compound
primary key along with data dependent on the relationship. ClassID,
StudentID, and SemesterID make up the compound PK and GPA is the data.
Additional pieces of data may exist such as comments, etc.

To query against this relationship, you would probably use a Left join if
you wanted a list of all students in the class regardless of whether a final
grade has been assigned.

Does this make any sense regarding your relationship?

FYI, it doesn't do you any good to shelter us from your real application,
especially for the reason given. If someone cannot recognize a correlation
between something they have done and something you are doing, you probably
don't want to take advice from them.

Jamie Collins · Sep 21, 2007

You need to recreate the schema so that it supports a many-to-many
relationship. That will solve your constraint issue. The constraints can't
be bi-directional or you'll never be able to add new parents to a table
since its children don't exist. If you think of it in those terms, the
relationship being bi-directional doesn't make any sense. How can you have
children without first having parents?

"Parent" and "child" are the wrong terms here, IMO (in SQL DBMS, the
equivalent terms are 'referenced' and 'referencing' respectively).
Consider the subclassing design pattern e.g. superclass table Vehicles
and subclass tables for SUV, Sedans, etc. These represent multiple one-
to-one relationships with no implied parent child relationship e.g. an
SUV is a vehicle and both 'Vehicle' and 'SUV' elements come into
existence at the same time. I could be wrong but this model seems to
fit the OP's scenario better than your "parent" and "child" analogy. I
think it's reasonable for the OP to expect Jet to have a constraint to
ensure, for example, that every row in a superclass table has exactly
one corresponding row among a number of subclass tables (Hugh Darwen
coined the term "distributed foreign key" for this) and for data to
come into existence in multiple tables at the same time (Chris Date
coined the term "multiple assignment" for this) but AFAIK the "proc
and privileges" approach (I coined that term <g>) I describe elsewhere
in this thread is the closest one can get using Jet.

Jamie.

--

Pat Hartman \(MVP\) · Sep 21, 2007

SUV, Sedan, etc. is an attribute of Vehicle. Body types (car models) can
exist as a lookup value without being physically represented by a specific
vehicle. However, a vehicle requires a body type.

Phil · Sep 23, 2007

Jamie said:
...Agreed, it sounds like multiple 1:1 relationships between the
'items' and the specialist lists' items...

Actually, Pat's earlier message convinced me that what I have IS an m2m
relationship, only I wasn't seeing it that way.

[...] CREATE TABLE ListGreatArtworks ( [...]
[...] CREATE TABLE ListIconicDesign ( [...]
[...] aspect_ratio DECIMAL(3, 2) NOT NULL [...]
[...] on_tour CHAR(1) DEFAULT 'N' NOT NULL [...]
[...] CREATE TABLE ThatsEntertainment ( [...]

No, I'm not trying to be that fancy. I said in the first post that
there are no differences in the item structure across lists, therefore
there's only one table holding ALL the items, no matter which list(s)
they're on.

The underlying problem is that the SQL language has no concept of
multiple assignment and although the SQL-92 standard includes
deferrable constraints Jet does not support them.

I'm not 100% sure what that means, but I'm pretty sure I could work
around my problem with transactions - if I were at the point of writing
code; but apparently Access has no way of manually starting or ending a
transaction. So far, I've been doing my testing by removing the
relationships, adding test data, and recreating the relationships
afterward - which is a pain in the neck.

PS Great title for a post

I'm glad it caught your attention!

Phil · Sep 23, 2007

Pat said:
You need to recreate the schema so that it supports a many-to-many
relationship. That will solve your constraint issue. The constraints can't
be bi-directional or you'll never be able to add new parents to a table
since its children don't exist.

Okay, but I'm not sure what I need to do differently.

FYI, it doesn't do you any good to shelter us from your real application

Sorry, I didn't realize that'd be more of an obstacle than a help.

If someone could show me how to export the DB table structure and
relationships as SQL, I'll post what I've done so far here. I was sure
I'd done this the last time I worked with Access, but I can't seem to
find the option to do it now.

Jamie Collins · Sep 23, 2007

SUV, Sedan, etc. is an attribute of Vehicle.

That's not under dispute. In my example scenario, SUV, Sedan, etc are
subclasses of the superclass Vehicle; in SQL-langauge terms SUV
REFERENCES Vehicles and Vehicles REFERENCES BodyTypes. For a close
analogy, think OOP inheritance: an SUV "is a" Vehicle. That's not a
parent-child relationship. Perhaps the OP's are not parent-child
either.

Jamie.

--

Jamie Collins · Sep 24, 2007

Actually, Pat's earlier message convinced me that what I have IS an m2m
relationship, only I wasn't seeing it that way.

I'm still not not seeing it that way but it's your model

I'm not 100% sure what that means, but I'm pretty sure I could work
around my problem with transactions - if I were at the point of writing
code; but apparently Access has no way of manually starting or ending a
transaction. So far, I've been doing my testing by removing the
relationships, adding test data, and recreating the relationships
afterward - which is a pain in the neck.

You can indeed explicitly start a transaction e.g. ANSI-92 Query Mode
syntax supports BEGIN TRANSACTION, COMMIT TRANSACTION and ROLLBACK
TRANSACTION. The problem you'll have is that instead of being able to
defer constraints (possible with full SQL-92 syntax SET CONSTRAINTS
<constraint names list> DEFERRED) you would have to DROP the
constraints within the transaction, which would cause a table lock.
Further note that a Jet PROCEDURE can only comprise one SQL statement,
whereas the above logic requires at least six SQL statements e.g.
pseudo code:

BEGIN TRANSACTION
;
ALTER TABLE MyTable DROP CONSTRAINT my_constraint
;
-- sql update(s) here e.g.
INSERT INTO MyTable (my_col) VALUES (0)
;
ALTER TABLE MyTable ADD CONSTRAINT my_constraint CHECK (...);
;
-- adding a constraint does not necessarily mean
-- it will get checked, therefore do something to
-- trigger it here e.g.
UPDATE MyTable
SET my_col = my_col
;
COMMIT TRANSACTION
;

The mechanisms to execute multiple statements in appropriate order,
control of flow, error handling etc must be done in the 'front end'
rather than SQL code

Jamie.

--

Jamie Collins · Sep 24, 2007

you would have to DROP the
constraints within the transaction, which would cause a table lock.

....and you'd have to grant schema privileges on your tables to users,
which is far from ideal IMO.

Jamie.

--

Pat Hartman \(MVP\) · Sep 24, 2007

In the example I used, my items appeared in only ONE list. That is why the
relationship could be supported by a single self referencing table. When
items can appear in multiple "lists", you need a minimum of two tables.
Generally m-m relationships require three tables but this one only requires
two tables. Table1 holds the items with no reference to where they are
used and Table1 connects items together to form "lists". So, all the items
grouped together with the same "parent", become part of a single list. This
model also allows you to nest lists similar to the way a Bill of Material is
implemented. If you don't need to nest lists, I would go for the clearer
three table set up which leaves no doubt about how things are used and
prevents nesting-
tblList
ListID (PK)
NameOfList
etc.
tblItem
ItemID (PK)
ItemName
etc

tblListItem
ListID (PK field1)
ItemID (PK field2)
any intersection data

I don't generally add autonumbers to relation tables unless the relation
table will have dependent child tables.

Too much integrity?

Phil

Pat Hartman \(MVP\)

Jamie Collins

Phil

Pat Hartman \(MVP\)

Jamie Collins

Pat Hartman \(MVP\)

Phil

Phil

Jamie Collins

Jamie Collins

Jamie Collins

Pat Hartman \(MVP\)