Design assistance needed (badly)

G

Guest

I know this question has to do with Data Normalization but the concept of
exactly how to do it still eludes me. I need to create a database that will
be able to track a lot of students and their test scores. For example:

Students in the 3rd – 6th grades take standardized tests in Language Arts,
Math, Science, and History each year. What I would like to do is keep track
of each test score each student achieved every year they took the tests.
I’ll need to enter the test data for a specific student for a specific school
year, say for the 2006-2007, then 2007-2008 school years for each test. In
the end I would like to be able to query, for example, which students in the
5th grade for the year 2006-2007 scored >70% on the Math test. Or, which
students in the 4th grade for 2007-2008 scored <60% on the Science test. Or,
I would like to query all of the Science test scores for Student A for all
the years this student took the Science test. Or, I would like to query and
view all of the test scores for Student A for all the years they took the
tests. These are examples, but you can see that I’d really like to be able
to study the data in many ways, both for large groups of students for certain
years or many years, as well as for an individual student for a certain year
or many years. I think that creating the queries will be the easy part.
What I can’t figure out is how many table do I need and what information do
the tables contain.

I have one student information table that contains the following information:

Last Name
First Name
Student Number
Current Grade Level
{and a little bit more specific information about that student}

The Student Number is the unique field and I assume my primary key. I
figure I will use the Student Number to create relationships between the
Student table and the other tables that contain the test scores for each year.

Ultimately, my question comes down to how do I design the other tables in a
logical way so that it makes sense? Do I create an individual table for each
grade level for each year and include all of the test scores using the
Student Number as the common connection? Or, should I create a separate
table for each test and somehow connect each score and school year to each
student? I can’t determine if I need lots of little tables that hold just a
small portion of the whole or fewer big tables that hold a much larger
portion of the whole.

Any thoughts you may have that could get me started in a good and reasonable
direction would be greatly appreciated.
 
G

Guest

Well, for what it's worth, here are some thoughts. You may get some responses
with better ideas, but hopefully this will at least give you some things to
think about.

A student can take many tests. Likewise, a test can have many students. So
you have a many-to-many relationship between students and tests. That means
you will need three tables to define the relationship. It might look
something like the following;

tblStudents
********
StudentID (Primary Key)
LastName
FirstName
DateOfBirth
(other fields related to student)

tblTests
*******
TestID (PK)
TestName
TestType
(other fields related to tests)

tblStudentTests
************
StudentTestID (PK)
StudentID (Foreign Key)
TestID (Foreign Key)
TestDate
StudentGrade (grade at the time the test was taken)
TestScore

You could then use queries to get results based on thing like;

a specific student or group of students

a specific test or tests

a specific date or a range of dates

a specific grade level or levels

certain test scores

etc.

A couple of things to note here. You might notice that I didn't include a
field for current grade level in tblStudents. That's because it is informaton
that changes every year and it can be calculated based on the students DOB.
You would not need to store it in a table, it would just be calculated and
displayed on you forms, reports, etc. If you store in a table field, then you
will need to go back and change it every school year, either manually or via
an update query or something.

On the other hand, I did put a grade level field in tblStudentTests. That's
because this is static information. When Johnny took the math test on
10/09/07 he was in Fifth grade. That is a fact that will never change. Some
might say that this is a value that could also be calculated by comparing the
DOB to the Test Date, but it seems easier to me just to store the value in
the table

You may also note that I don't list a table for grade levels. Again, that's
because a students grade can be calculated from their DOB, However, you may
have other reasons for needing a table that holds info about grade levels.
You may also need other tables related to teachers, administrative staff,
etc. I have no way of knowing how in depth your DB is supposed to be, so I
can't say.

Anyway, hope this helps and good luck
 
T

tina

something along the lines of

tblStudents
StudentID (primary key)
FirstName
LastName
<other fields that describe a student. do not put "current grade level" in
this table, though; read on.>

tblSchoolYears
YearID (primary key)
SchYear
< this table will hold one record each for "2005/06", "2006/07", etc>

tblSubjects
SubjectID (primary key)
SubjectName
<one record each for "Math", "Science", etc>

tblStudentTests
StuTest_ID (primary key)
YearID (foreign key from tblSchoolYears)
StudentID (foreign key from tblStudents)
GradeLevel
<this is a more appropriate place for grade level, because each student is
tested during a specific school year and *at that point in time* is in a
specific grade.
from a data entry standpoint, i imagine it makes more sense as well; for
example, i can picture the poor user(s) getting a stack of math tests from
all the "fifth grade" classes, and entering the scores for each student. in
that scenario, grade level is readily apparent.>

one student may be tested over many years, and one year may have many
students tested. that's a many-to-many relationship, resolved by the linking
table tblStudentTests:
tblSchoolYears.YearID 1:n tblStudentTests.YearID
tblStudents.StudentID 1:n tblStudentTests.StudentID

tblTestScores
ScoreID (primary key)
StuTestID (foreign key from tblStudentTests)
SubjectID (foreign key from tblSubjects)
Score

also, one student may be tested in many subjects in a given school year, and
many students may be tested in one subject. there's another many to many
relationship, resolved by the linking table tblTestScores:

tblStudentTests.StuTestID 1:n tblTestScores.StuTestID
tblSubjects.SubjectID 1:n tblTestScores.SubjectID

the above structure is completely flexible; you can add student records,
school year records, and subject records as needed; and record as many test
scores for as many students in as many subjects over as many years, as you
want.

recommend you read up some more on relational design principles; see
http://www.accessmvp.com/JConrad/accessjunkie/resources.html#DatabaseDesign101.

hth
 
G

Guest

First of all, thanks for your assistance! It is not only greatly appreciated
but it has helped. I have a question based on your suggestion.

You suggested using a field TestID. It feels somewhat redundant to have a
TestID and a TestName in that it seems like there are two designations for
one test. If I have a test called Math3 (3rd grade math test) or Science5
(5th grade science test) that seems to be enough for me to be able to
identify that particular test without having to assign a second designation
for the same test. Is there a particular reason why this would be better
than using TestName as my primary key and matching that to the same field in
tblStudentTests (Foreign Key)?

My guess is you may have suggested what you did based on my less than
understandable explanation of what I need to accomplish. If that's the case
it is my fault. I'm just wondering if there is a specific reason why it
would make better sense to have both instead of just one.

Still a Messy Guy
 
G

Guest

As I mentioned to Beetle, thank you very much for your help with this. Just
having the two different suggestions to look at and compare has helped give
me many things to consider and think about (which is a good thing)!

I have a similar question to ask that may have a similar answer from your
end in that it is very likely that my original question was not clear enough
for you to get a better idea of what I'm trying to accomplish. Again, that's
my fault.

In the case of your suggestion it sometimes feels like I have to create
field data in order to have primary/foreign keys that can match up to make
the relationships. Is there a specific reason that what feels like extra
fields are necessary to make this all work better? As long as I can match
key fields and make the relationships does it make a difference how it is
done?

For example, in tblSubjects there are two fields (SubjectID & SubjectName).
Could I get away with only having one field (SubjectName), making that my
primary key, and making it a foreign key in tblTestScores? I do have a real
subject name (Math, Science, Reading, etc.) but I would be making up an
artifical SubjectID if I had a second field in tblSubjects.

I gather you made the suggestion(s) you did for a reason, so that's what I'm
wondering about.

By the way, thanks for the link to the site for more info about design. I
have checked it out and there is lots there for me to look at--a process I
have begun, so I really appreciate the extra info.

Messy


tina said:
something along the lines of

tblStudents
StudentID (primary key)
FirstName
LastName
<other fields that describe a student. do not put "current grade level" in
this table, though; read on.>

tblSchoolYears
YearID (primary key)
SchYear
< this table will hold one record each for "2005/06", "2006/07", etc>

tblSubjects
SubjectID (primary key)
SubjectName
<one record each for "Math", "Science", etc>

tblStudentTests
StuTest_ID (primary key)
YearID (foreign key from tblSchoolYears)
StudentID (foreign key from tblStudents)
GradeLevel
<this is a more appropriate place for grade level, because each student is
tested during a specific school year and *at that point in time* is in a
specific grade.
from a data entry standpoint, i imagine it makes more sense as well; for
example, i can picture the poor user(s) getting a stack of math tests from
all the "fifth grade" classes, and entering the scores for each student. in
that scenario, grade level is readily apparent.>

one student may be tested over many years, and one year may have many
students tested. that's a many-to-many relationship, resolved by the linking
table tblStudentTests:
tblSchoolYears.YearID 1:n tblStudentTests.YearID
tblStudents.StudentID 1:n tblStudentTests.StudentID

tblTestScores
ScoreID (primary key)
StuTestID (foreign key from tblStudentTests)
SubjectID (foreign key from tblSubjects)
Score

also, one student may be tested in many subjects in a given school year, and
many students may be tested in one subject. there's another many to many
relationship, resolved by the linking table tblTestScores:

tblStudentTests.StuTestID 1:n tblTestScores.StuTestID
tblSubjects.SubjectID 1:n tblTestScores.SubjectID

the above structure is completely flexible; you can add student records,
school year records, and subject records as needed; and record as many test
scores for as many students in as many subjects over as many years, as you
want.

recommend you read up some more on relational design principles; see
http://www.accessmvp.com/JConrad/accessjunkie/resources.html#DatabaseDesign101.

hth
 
G

Guest

I put the TestID field in there just because that's my standard practice for
table design. You could use TestName as the PK as long as you're certain that
you will never need to reuse any of the names. For example, let's say that
next year you change the questions on the math test. To my way of thinking,
that would technically make it a different test. Now let's suppose that you
still wanted to call it Math3. If you use TestName as the PK, then Math3
would already be taken and you wouldn't be able to reuse it. If you have
TestID as the PK, then you could reuse the name Math3, and perhaps have
another field in the table that notes the difference between the Math3 tests.
Maybe this situation would never come up for you, but it's an example of the
type of thing you should take into consideration if you are going to use
TestName as your PK.

Another reason I like to have a numeric PK field is that they are a little
easier to deal with when writing Visual Basic code. You may not yet have
reached the point in your DB where you need to do any coding, but sooner or
later you probably will.

HTH
 
M

Michael Gramelspacher

CREATE TABLE Tests
(test_name VARCHAR (50) NOT NULL,
version_date DATETIME NOT NULL,
<other columns>,
PRIMARY KEY (test_name, version_date));

or

CREATE TABLE Tests
(test_id COUNTER NOT NULL,
(test_name VARCHAR (50) NOT NULL,
version_date DATETIME NOT NULL,
<other columns>,
UNIQUE (test_name,version_date),
PRIMARY KEY (test_id));

If it is necessary to keep multiple versions of the same tests, then
you need to have a version number or version date column to tell the
tests apart. Using an autonumber column alone to enforce uniqueness
just does not get it. You also need a unique constraint on the
natural key columns.
 
T

tina

one of the attributes of a primary key is that it is stable; it will never
change. when you use meaningful data as a primary key, it is impossible to
guarantee that you will never need to change a primary key value. trust me,
even in simple tables where you're sure you'll never need to change key
data, it'll happen. for that reason, and others, i *almost* always use a
surrogate primary key in my tables, rather than a natural key as primary.

this subject has been hotly debated in these newsgroups many times, so you
can google the groups if you want to read up on both sides of the debate. to
read a well-written paper on the subject, go to
http://www.dbpd.com/vault/9805xtra.htm.

hth


AMessyGuy said:
As I mentioned to Beetle, thank you very much for your help with this. Just
having the two different suggestions to look at and compare has helped give
me many things to consider and think about (which is a good thing)!

I have a similar question to ask that may have a similar answer from your
end in that it is very likely that my original question was not clear enough
for you to get a better idea of what I'm trying to accomplish. Again, that's
my fault.

In the case of your suggestion it sometimes feels like I have to create
field data in order to have primary/foreign keys that can match up to make
the relationships. Is there a specific reason that what feels like extra
fields are necessary to make this all work better? As long as I can match
key fields and make the relationships does it make a difference how it is
done?

For example, in tblSubjects there are two fields (SubjectID & SubjectName).
Could I get away with only having one field (SubjectName), making that my
primary key, and making it a foreign key in tblTestScores? I do have a real
subject name (Math, Science, Reading, etc.) but I would be making up an
artifical SubjectID if I had a second field in tblSubjects.

I gather you made the suggestion(s) you did for a reason, so that's what I'm
wondering about.

By the way, thanks for the link to the site for more info about design. I
have checked it out and there is lots there for me to look at--a process I
have begun, so I really appreciate the extra info.

Messy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top