advanced queries for a student portfolio

A

AndREa

Hi All,
I'm writing a "majors portfolio" application. I want to
track the aggregate information about our majors. I have
the student information in a (normalized) format where I
can retrieve data like studentKey took MATH course 1720 in
the Fall of 2003 and got an A+ grade from records like
KEY1 | MATH | 1720 | Fall | 2003 | A+
KEY2 | CPSC | 2731 | Fall | 2000 | C

From this data I want to get aggregate information like

* What is the first math course that our students take? I
know which courses are math course (course field = MATH) I
want to write a query that given a particular MATH course
it returns the records of students who _started_ in that
course. It's not necessary to return all students who
took the course, I want to know who had no MATH course
before taking the input course.

* How many students passed Course1 on the Mth time and
Course2 on the Nth time? An example of this is "who
passed Course1 the first time but took 2 times to pass
Course2?" [I know a student passed if they got an A,B,C.]

* What are the chances that a student starting in Course1
passes Course2? This query is not "who passed both
Course1 and Course2" but is instead "given that you
started in Course1 what are your chances of passing
Course2".

Please recommend a book that discusses advanced query
techniques. And of course, any help is much appreciated,
* AndREa * ([email protected])
 
J

John Spencer (MVP)

I would recommend
SQL Queries for Mere Mortals
by Michael Hernandez and John Viescas

In addition, this group is a good source for answering questions on queries. I
would recommend that after you've attempted to solve a problem, you post ONE
query problem to the group. If you post several at one time, you are less
likely to get an answer.

Looking at your record sample I do have one quick suggestion. You might
consider changing the Semester (Fall) to a number value or storing a full date
for the semester start instead of just the year number. It would be hard to
determine (not impossible) the most recent course for a student with the records
Fall 2003
Spring 2003
Summer 2002
Summer 2003
Etc.

You could have a separate table for semesters with a primary key, a start date,
a semester name (Fall, Spring, Mid-Winter, Summer, Inter-Session I,
Inter-session II, etc) and even an Academic year. That would make it a lot
easier to determine the most recent course, the first course, the first course
in an academic year, the last course in the calendar year, etc.

As far as your first request, given your current structure the following
UNTESTED SQL might work assuming you only have two semesters Fall and Spring.

Select StudentKey, Semester, CourseYear
FROM CourseEnrollees as A
WHERE CourseField="Math" and CourseNumber = "1720"
AND Exists (SELECT * FROM CourseEnrollees as tmp
WHERE (A.StudentKey = T.StudentKey
AND T.CourseField = "Math"
AND T.CourseNumber <> "1720")
AND
(T.CourseYear = A.CourseYear
AND T.Semester = "Fall" OR
T.CourseYear < A.CourseYear)) = False

And therein T.Semester < A.Semester lies a problem, since Fall, Spring, Summer
are not easily ordered. You would have to restructure your table or write a
custom function to assign a numeric value to the Semester string.


Hi All,
I'm writing a "majors portfolio" application. I want to
track the aggregate information about our majors. I have
the student information in a (normalized) format where I
can retrieve data like studentKey took MATH course 1720 in
the Fall of 2003 and got an A+ grade from records like
KEY1 | MATH | 1720 | Fall | 2003 | A+
KEY2 | CPSC | 2731 | Fall | 2000 | C

From this data I want to get aggregate information like

* What is the first math course that our students take? I
know which courses are math course (course field = MATH) I
want to write a query that given a particular MATH course
it returns the records of students who _started_ in that
course. It's not necessary to return all students who
took the course, I want to know who had no MATH course
before taking the input course.

* How many students passed Course1 on the Mth time and
Course2 on the Nth time? An example of this is "who
passed Course1 the first time but took 2 times to pass
Course2?" [I know a student passed if they got an A,B,C.]

* What are the chances that a student starting in Course1
passes Course2? This query is not "who passed both
Course1 and Course2" but is instead "given that you
started in Course1 what are your chances of passing
Course2".

Please recommend a book that discusses advanced query
techniques. And of course, any help is much appreciated,
* AndREa * ([email protected])
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top