Inefficient one-to-many or one huge table

C

cjg.groups

Should I use two tables related one-to-many with lots of duplicate
data? Or should I reduce redundancy into a single 100+ field table?
And could I break that 100+ field table into two tables, related
one-to-one?

The goal is to store Samples and their analysis Results. The Samples
table contains information about each sample that was collected. Lab
analysis generates a spreadsheet of results for each sample. The sheet
has 7 information fields and 1 result field for 35 parameters (one
parameter per row); a 35Rx8C sheet.

This looks like a two-table one-to-many design, but seems inefficient.
The information fields contain the same data for most, but not all
parameters (records/rows); some need unique information data. To begin
entering result data, my Access form would need to allocate 35 records
and prepopulate many of their fields. Or is there another way?

To reduce duplicate data, I've designed a table which stores the entire
sheet on one row. I've grouped similar parameters to share similar
data, leaving VB to reassemble the 35x8 table upon TXT output, which
saved almost 150 cells of data. This design gives me Samples and
Results tables in a one-to-one relationship, which seems pointless
(though tidy). I could combine them to one table with 100+ fields, but
is that too bulky? And is that bad design since Samples and Results
contain somewhat different types of information?

Or is there an entirelly different way to store this information?
Thanks for your input.
 
T

tina

do the information fields contain "standard" data about parameters? in other
words, for parameter 17, is the data in those 7 information fields the same
whether you're talking about sample 238 or sample 856 or sample 1014? if
yes, then suggest three tables:

tblParameters
ParamID (primary key)
<list each of the seven "information" fields that describe a specific
parameter.>

tblSamples
SampleID (pk)
SampleDate
<any other fields that describe a specific sample.>

tblTestResults
ResultID (pk)
SampleID (foreign key from tblSamples)
ParamID (fk from tblParameters)
TestResult

relationships are:
tblSamples.SampleID 1:n tblTestResults.SampleID
tblParameters.ParamID 1:n tblTestResults.ParamID

there are somewhat rare occasions when normalization rules may be broken to
facilitate the accurate and/or efficient handling of data - but your
situation does not sound like one of those. generally speaking, table design
should never be determined by the needs of the user interface. recommend you
read up on normalization principles; see
http://home.bendbroadband.com/conradsystems/accessjunkie/resources.html#DatabaseDesign101
for numerous good sources.

hth
 
C

cjg.groups

tina,

Thank you for the suggestion. So, do you suggest that tblParameters
has 35 records, one for each parameter, and doesn't change? This works
for about 20 parameters, but 15 require some of their information
fields to change for each sample. I could store those changing fields
in tblTestResults, but it would cause duplication for the 15 parameters
that use the same data for that field.

I've thought of two other ideas.
One idea is: For the 15 similar parameters, don't even store the result
unless it's non-zero, which is rare. If its non-zero, store all 8
fields in an "override" table. This requires the two output programs
to check the override table 15 times as it outputs to see if the
parameter was non-zero. Too complex.

Another idea is: Use one table for samples and results with one record
per sample/results with many many fields. Reduce the number of fields
by hardcoding much of the data into the output code. These fields
change less than once a year, so it seems pointless to store. I could
document the hardcoded values when they change, make a table of
hardcoded values and associate each sample/result with it, or make some
global arrays of hardcoded values and swap them as necessary.

Is it possible to share a hardcoded array with all the queries and VB
code in the Access database? Thanks.
 
G

Guest

You have 35 parameters. These don't change. However, if you ever need to
add any more parameters, it is easy to add them to this table.

tbl_Parameters
ParameterID (PK)
ParameterInfo

You have 7 types of basic information, with the need for unique or special
information. You can use the SpecialInfo field to designate if it is a
unique or special information.

tbl_Information
InformationID (PK)
InformationInfo
SpecialInfo (Yes/No) (default to No)

This is all of your samples.

tbl_Samples
SampleID (PK)
SampleInfo

This table ties everything together. There will be a record for each test
result. This will end up being a very large table.

tbl_TestResults
TestResultID (PK)
ParameterID (FK)
InformationID (FK)
SampleID (FK)
TestResultInfo

You will then link all of the foreign keys in the TestResults table to the
primary keys in the other tables in the relationships window. Set up a query
with everything in it, and use that to build your forms and subforms to see
all of your data. That can get a little tricky, but by using tabbed windows
you can separate various data that belong to a particular sample. In this
way, you could have a separate tabbed window to display the unique
information, for instance.
 
T

tina

comments inline.

tina,

Thank you for the suggestion. So, do you suggest that tblParameters
has 35 records, one for each parameter, and doesn't change? This works
for about 20 parameters, but 15 require some of their information
fields to change for each sample.

if parameter information changes according to the sample, then it sounds
like the information describes a specific instance of a parameter for a
specific sample; that makes the information sample-specific, not merely
parameter-specific.
I could store those changing fields
in tblTestResults, but it would cause duplication for the 15 parameters
that use the same data for that field.

you need to distinguish between *repeating* data, and *duplicate* data. in a
relational database, they're not the same thing. if you have a table of 100
Customers, and you store each customer's birthdate in that table AND you
also store the customer birthdate in every record in the CustomerOrders
table - that's duplication of data. but if you have a table of 100
Customers, and you store the delivery date in each record in a related
CustomerOrders table, and many customers request delivery on the same date -
then you'll have *repeating* values in many orders records, but it is *not a
duplication* of data.

i don't know enough about the process you're trying to model, the
entities/events and their characteristics, to give you more specific
recommendations - and i doubt that sufficient analysis can be made in this
forum. all i can suggest is that you make further study of the principles of
data modeling, and make a thorough analysis of your process; when you've
done that, the correct assignment of fields (characteristics) to tables
(entities and/or events) should become easier for you to determine. one good
text for learning data modeling is Database Design for Mere Mortals by
Michael Hernandez, and you can find numerous links to the subject at
http://home.bendbroadband.com/conradsystems/accessjunkie/resources.html#DatabaseDesign101

hth
 
C

cjg.groups

Thank you all for your help. I have bookmarked this thread and will
review it to help me clean up the table design for future versions of
this database. For now, I found ways to reduce "repeating values" by
making some reasonable assumptions. Instead of storing 400 cells of
data per sample, I am storing only 68 cells. Ultimately, I realized
some of the information fields were related to the lab that handled the
samples; an opportunity for a second table.

Thanks again for your help.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top