HELP! referential integrity multiple use of same table

G

Guest

I will try to simplify the problem. I have two tables called samples and
Isolates, and would like to store pictures for both. I can join these each
with the same photo table on the many side and it works fine, UNLESS i use
referential integrity, in which case if i fill out sample than add a photo to
it i get an access message saying that i have to fill out isolate also
(actually make sence!). Do i have to create 8 different tables to store
photos or can i somehow use the same photo table for storing all photos in
the DB and use RI?

many thanks...
 
V

Vincent Johns

salmonella said:
I will try to simplify the problem. I have two tables called samples and
Isolates, and would like to store pictures for both.

Can you store several (at least 2) pictures for one record in [Samples]?
Can you do that for [Isolates] as well?
I can join these each
with the same photo table on the many side

Many side of what? Can there be several [Samples] records for each
[Isolates] record?
and it works fine, UNLESS i use
referential integrity, in which case if i fill out sample than add a photo to
it i get an access message saying that i have to fill out isolate also
(actually make sence!). Do i have to create 8 different tables to store
photos or can i somehow use the same photo table for storing all photos in
the DB and use RI?

many thanks...

How you set up your Tables depends on your requirements. From what you
say, I'll guess that a photo is not required and can be added after the
record is. Is your [Photos] Table set up so that each record contains a
key linking it to [Isolates] and another key linking the record to
[Samples]? You could do that if you turn off referential integrity
(just for the [Photos] records) and allow some of those keys to have
null values. [Isolates] and [Samples] could still be linked using
referential integrity.

Another way to link them would be to have one [Photos] Table, which
would contain fields identifying the file names of the actual
photographs and maybe some descriptive information and to have two other
Tables referring to it. The [IsolatesPhotos] Table could contain a
field that links (many to one) to [Isolates] and another field linking
to [Photos] (one to one, but not all [Photos] records would be linked to
this Table). The [SamplesPhotos] Table would contain similar links,
except to [Samples] instead of [Isolates], and some of the [Photos]
records would be linked (one to one) to [SamplesPhotos]. The records in
[IsolatesPhotos] and [SamplesPhotos] would be small, each containing
nothing except two or three links, and you could enforce referential
integrity on these. Would that be closer to what you want to do?

I suppose I should mention that maintaining referential integrity may be
slightly overrated. It can save time, if you turn on "Cascade Delete",
for example, but having consistent records does not guarantee that you
have correct records. For that you need careful data entry.

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
P

peregenem

Vincent said:
I should mention that maintaining referential integrity may be
slightly overrated. It can save time, if you turn on "Cascade Delete",
for example, but having consistent records does not guarantee that you
have correct records. For that you need careful data entry.

You need to maintain referential integrity *and* data integrity. A
CASCADE adds overhead i.e. *adds* time to a DELETE operation. But the
price is definitely worth paying.
having consistent records does not guarantee that you
have correct records. For that you need careful data entry.

No, careless data entry is inevitable. You need appropriate constraints
in the database, e.g. CHECK constraints (a.k.a Validation rules), to
prevent bad data from entering the database when careless data entry
(or careless front end validation) happens e.g.

CREATE TABLE Test (
last_name VARCHAR(35) NOT NULL,
CONSTRAINT last_name_not_zero_length
CHECK (LEN(last_name) > 0),
CONSTRAINT last_name_no_double_spaces
CHECK (last_name NOT LIKE '% %'),
CONSTRAINT last_name_no_leading_space
CHECK (last_name NOT LIKE ' %'),
CONSTRAINT last_name_no_trailing_space
CHECK (last_name NOT LIKE '% ')
)

You could combine some of these constraints but keeping them separate
and giving then meaningful names makes for better feedback when they
bit e.g.

Con.Execute "INSERT INTO Test VALUES (' Egenem')"
? Con.Errors(0).Description
One or more values are prohibited by the validation rule
'last_name_no_leading_space' set for 'Test'. Enter a value that the
expression for this field can accept.
 
G

Guest

Hi john,
I see you are from Caltech. Did a postdoc there, nice place.

Anyway, to answer your first two questions, there is one sample_site from
which many samples are taken and one sample from which many bacterial
isolates are found, and so on. For each of these “backbone†tables there
would be many records for photos, tests, etc. For example, 5 photos taken of
a sample and 6 tests done on a sample or isolate. Thus I have “sample†on the
“one†side with photos on the many side and I do the exact same thing with
the other “backbone†tables like culture and bacterial isolate. So what I am
doing is collecting a set of data (photos, tests, etc) about a sample_source
then move to sample and take the exact same data types (photos, tests, etc),
then move to culture, and the same date types right on down the “backboneâ€. I
have them joined where the same foreign key in photo is joined to each PK in
the tables sample, isolate, etc.


I really find this interesting. For Laboratory work, there is redundancy in
the data you collect at different stages of a process. It appears that this
type of redundancy (the concept of taking the same types of data at different
steps in a process- where each step must be a table in chain or backbone of
joined tables) that is not common for most other databases themes.

I think you are right, it simply cannot be easily done and the best I can do
is to implement integrity across some joins but not others (the other
suggestion was not compatible with the collection of the data, ie descriptive
names, etc). This of course could create orphan records if someone deleted a
record, however it is hard to imagine anyone actually deleting scientific
data so I doubt that it will not be a major problem.

By the way, I assume that the other way to have RI and take repetitive types
of data at each step along a process is just to make individual tables and
forms for each step, however that makes things too confusing and balloons the
db. For example, this would add about 50 extra tables to my db which is in my
mind not worth the loss of RI across some joins.

One other thing that I don’t understand, what is the purpose of being able
to add more than one copy of the same table to the Relationship window? For
example, if I have tblPhoto and I add it again, I get tblPhoto_1???

Thanks a lot for your time and suggestions.


Vincent Johns said:
salmonella said:
I will try to simplify the problem. I have two tables called samples and
Isolates, and would like to store pictures for both.

Can you store several (at least 2) pictures for one record in [Samples]?
Can you do that for [Isolates] as well?
I can join these each
with the same photo table on the many side

Many side of what? Can there be several [Samples] records for each
[Isolates] record?
and it works fine, UNLESS i use
referential integrity, in which case if i fill out sample than add a photo to
it i get an access message saying that i have to fill out isolate also
(actually make sence!). Do i have to create 8 different tables to store
photos or can i somehow use the same photo table for storing all photos in
the DB and use RI?

many thanks...

How you set up your Tables depends on your requirements. From what you
say, I'll guess that a photo is not required and can be added after the
record is. Is your [Photos] Table set up so that each record contains a
key linking it to [Isolates] and another key linking the record to
[Samples]? You could do that if you turn off referential integrity
(just for the [Photos] records) and allow some of those keys to have
null values. [Isolates] and [Samples] could still be linked using
referential integrity.

Another way to link them would be to have one [Photos] Table, which
would contain fields identifying the file names of the actual
photographs and maybe some descriptive information and to have two other
Tables referring to it. The [IsolatesPhotos] Table could contain a
field that links (many to one) to [Isolates] and another field linking
to [Photos] (one to one, but not all [Photos] records would be linked to
this Table). The [SamplesPhotos] Table would contain similar links,
except to [Samples] instead of [Isolates], and some of the [Photos]
records would be linked (one to one) to [SamplesPhotos]. The records in
[IsolatesPhotos] and [SamplesPhotos] would be small, each containing
nothing except two or three links, and you could enforce referential
integrity on these. Would that be closer to what you want to do?

I suppose I should mention that maintaining referential integrity may be
slightly overrated. It can save time, if you turn on "Cascade Delete",
for example, but having consistent records does not guarantee that you
have correct records. For that you need careful data entry.

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
G

Guest

good point, thanks for the sugestion!




You need to maintain referential integrity *and* data integrity. A
CASCADE adds overhead i.e. *adds* time to a DELETE operation. But the
price is definitely worth paying.


No, careless data entry is inevitable. You need appropriate constraints
in the database, e.g. CHECK constraints (a.k.a Validation rules), to
prevent bad data from entering the database when careless data entry
(or careless front end validation) happens e.g.

CREATE TABLE Test (
last_name VARCHAR(35) NOT NULL,
CONSTRAINT last_name_not_zero_length
CHECK (LEN(last_name) > 0),
CONSTRAINT last_name_no_double_spaces
CHECK (last_name NOT LIKE '% %'),
CONSTRAINT last_name_no_leading_space
CHECK (last_name NOT LIKE ' %'),
CONSTRAINT last_name_no_trailing_space
CHECK (last_name NOT LIKE '% ')
)

You could combine some of these constraints but keeping them separate
and giving then meaningful names makes for better feedback when they
bit e.g.

Con.Execute "INSERT INTO Test VALUES (' Egenem')"
? Con.Errors(0).Description
One or more values are prohibited by the validation rule
'last_name_no_leading_space' set for 'Test'. Enter a value that the
expression for this field can accept.
 
B

Bruce Rusk

Without knowing exactly what's involved, it sounds like there's no reason to
make 50 tables, one for each stage of the test. Assuming you're performing
the same type of test (i.e., the results take the same form each time), you
could have a single table for the results from each stage:

tblTests
SampleID (FK from Sample table)
StageID (FK from a table listing the possible stages)
TimeOfTesting
Results
..
..
..


salmonella said:
Hi john,
I see you are from Caltech. Did a postdoc there, nice place.

Anyway, to answer your first two questions, there is one sample_site from
which many samples are taken and one sample from which many bacterial
isolates are found, and so on. For each of these "backbone" tables there
would be many records for photos, tests, etc. For example, 5 photos taken
of
a sample and 6 tests done on a sample or isolate. Thus I have "sample" on
the
"one" side with photos on the many side and I do the exact same thing with
the other "backbone" tables like culture and bacterial isolate. So what I
am
doing is collecting a set of data (photos, tests, etc) about a
sample_source
then move to sample and take the exact same data types (photos, tests,
etc),
then move to culture, and the same date types right on down the "backbone".
I
have them joined where the same foreign key in photo is joined to each PK
in
the tables sample, isolate, etc.


I really find this interesting. For Laboratory work, there is redundancy
in
the data you collect at different stages of a process. It appears that
this
type of redundancy (the concept of taking the same types of data at
different
steps in a process- where each step must be a table in chain or backbone
of
joined tables) that is not common for most other databases themes.

I think you are right, it simply cannot be easily done and the best I can
do
is to implement integrity across some joins but not others (the other
suggestion was not compatible with the collection of the data, ie
descriptive
names, etc). This of course could create orphan records if someone deleted
a
record, however it is hard to imagine anyone actually deleting scientific
data so I doubt that it will not be a major problem.

By the way, I assume that the other way to have RI and take repetitive
types
of data at each step along a process is just to make individual tables and
forms for each step, however that makes things too confusing and balloons
the
db. For example, this would add about 50 extra tables to my db which is in
my
mind not worth the loss of RI across some joins.

One other thing that I don't understand, what is the purpose of being able
to add more than one copy of the same table to the Relationship window?
For
example, if I have tblPhoto and I add it again, I get tblPhoto_1???

Thanks a lot for your time and suggestions.


Vincent Johns said:
salmonella said:
I will try to simplify the problem. I have two tables called samples
and
Isolates, and would like to store pictures for both.

Can you store several (at least 2) pictures for one record in [Samples]?
Can you do that for [Isolates] as well?
I can join these each
with the same photo table on the many side

Many side of what? Can there be several [Samples] records for each
[Isolates] record?
and it works fine, UNLESS i use
referential integrity, in which case if i fill out sample than add a
photo to
it i get an access message saying that i have to fill out isolate also
(actually make sence!). Do i have to create 8 different tables to store
photos or can i somehow use the same photo table for storing all photos
in
the DB and use RI?

many thanks...

How you set up your Tables depends on your requirements. From what you
say, I'll guess that a photo is not required and can be added after the
record is. Is your [Photos] Table set up so that each record contains a
key linking it to [Isolates] and another key linking the record to
[Samples]? You could do that if you turn off referential integrity
(just for the [Photos] records) and allow some of those keys to have
null values. [Isolates] and [Samples] could still be linked using
referential integrity.

Another way to link them would be to have one [Photos] Table, which
would contain fields identifying the file names of the actual
photographs and maybe some descriptive information and to have two other
Tables referring to it. The [IsolatesPhotos] Table could contain a
field that links (many to one) to [Isolates] and another field linking
to [Photos] (one to one, but not all [Photos] records would be linked to
this Table). The [SamplesPhotos] Table would contain similar links,
except to [Samples] instead of [Isolates], and some of the [Photos]
records would be linked (one to one) to [SamplesPhotos]. The records in
[IsolatesPhotos] and [SamplesPhotos] would be small, each containing
nothing except two or three links, and you could enforce referential
integrity on these. Would that be closer to what you want to do?

I suppose I should mention that maintaining referential integrity may be
slightly overrated. It can save time, if you turn on "Cascade Delete",
for example, but having consistent records does not guarantee that you
have correct records. For that you need careful data entry.

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
V

Vincent Johns

Bruce said:
Without knowing exactly what's involved, it sounds like there's no reason to
make 50 tables, one for each stage of the test. Assuming you're performing
the same type of test (i.e., the results take the same form each time), you
could have a single table for the results from each stage:

tblTests
SampleID (FK from Sample table)
StageID (FK from a table listing the possible stages)
TimeOfTesting
Results
...

This looks good to me. I could go into more detail if I had time, but
let me say right now that if the records are similar to each other, they
should go into one Table, with a field (such as Bruce Rusk's [StageID])
that you can use to distinguish the types.

If the record contents are DISsimilar, such as photographs vs. sets of
numerical test results, then they can go into separate Tables, linked to
the governing record via a foreign key in each. The Tables organize the
data, and the keys collect them. You can have lots of keys in a single
record, with differing purposes.

To answer your question about 2 copies of a Table in one Query, consider
an [Employees] Table, in which you have a field identifying a
supervisor, who is also an employee. Same Table, two purposes. You
need a different name for each (virtual) copy of the Table. And you
have the option of renaming the copy; instead of [Employees_1] it could
be [Emp_Bosses].

Sorry, gottago.

BTW, your name reminds me of a "Funky Winkerbean" comic strip in which
the high-school band's fall fund raiser consisted of selling
Thanksgiving turkeys from Sam 'n' Ella's Turkey Ranch. :)

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top